Welcome


As the head of Advanced Designations at Kaplan Schweser, I am pleased to have the 
opportunity to help you prepare for the FRM® exam. Kaplan Schweser has decades of 
experience in delivering the most effective FRM exam prep products in the market 
and I know you will find them to be invaluable in your studies. 


Our products are designed to be an integrated study solution across print and digital 
media to provide you with the best learning experience, whether you are studying 
with a physical book, online, or on your mobile device. 


Our core product, the SchweserNotes, addresses all Topic Areas, Readings, and
Learning Objectives in the FRM curriculum. Each reading in the SchweserNotes has 
been broken into smaller, bite-sized modules with Module Quizzes interspersed 
throughout to help you continually assess your comprehension. Topic Quizzes and 
Checkpoint Exams appear online to help you gauge your knowledge of the material 
before you move on to the next section. 


All purchasers of the SchweserNotes receive online access to the Kaplan Schweser 
online platform (our learning management system or LMS) at www.Schweser.com. In 
the LMS, you will see a dashboard that tracks your overall progress and performance 
as well as an Activity Feed, which provides structure and organization to the tasks 
required to prepare for the FRM exam. You also have access to the online versions of 
the SchweserNotes and Module Quizzes. Look for the icons indicating where Module 
Quizzes are available online. I strongly encourage you to enter your Module Quiz 
answers online and use the dashboard to track your progress and stay motivated. 


Again, thank you for trusting Kaplan Schweser with your FRM exam preparation. 
We're here to help you throughout your journey to become a certified Financial Risk 
Manager. 


Regards, 


Derek Burkett, CFA, FRM, CAIA 
Vice President (Advanced Designations) 


Contact us for questions about your study package, upgrading your package, purchasing 
additional study materials, or for additional information: 


888.325.5072 (U.S.) | +1 608.779.8327 (Int’l.)
WELCOME TO THE 2024
SCHWESERNOTES


Thank you for trusting Kaplan Schweser to help you reach your career and educational 
goals. We are very pleased to be able to help you prepare for the FRM Part II exam. In 
this introduction, I want to explain the resources included with the SchweserNotes, 
suggest how you can best use Kaplan Schweser materials to prepare for the exam, and 
direct you toward other educational resources you will find helpful as you study for the 
exam. 


SchweserNotes


The SchweserNotes consist of five volumes that include complete coverage of all FRM 
assigned readings and learning objectives (LOs), as well as module quizzes (multiple- 
choice questions for every reading) to help you master the material and check your 
retention of key concepts. 


Practice Questions 


To retain the material, it is important to quiz yourself often. We offer an online version 
of the SchweserPro™ QBank, which contains hundreds of Part II practice questions and 
explanations. We also offer Topic Quizzes and Checkpoint Exams online to further help 
you retain and apply what you have learned. 


Mock Exams 


Schweser offers four full 4-hour, 80-question practice exams. These online exams are 
important tools for gaining the speed and skills you will need to pass the exam. The 
Mock Exams contain answers with full explanations for self-grading and evaluation. 


OnDemand Class 


Our OnDemand Class provides comprehensive online instruction of every reading in 
the FRM curriculum. This video lecture series brings the personal attention of a 
classroom into your home or office with over 50 hours of instruction. The class offers 
in-depth coverage of difficult concepts as well as a discussion of sample exam 
questions. All videos are available for viewing at any time throughout the season. 
Candidates enrolled in the OnDemand Class also have the ability to email questions to 
the instructor at any time. 


Late-Season Review 


Late-season review and exam practice can make all the difference. Our OnDemand
Review Package helps you evaluate your exam readiness with products specifically
designed for late-season studying. This study package includes the OnDemand Review
(20-hour archived online workshop covering essential curriculum topics) and
Schweser’s Secret Sauce® (concise summary of the FRM curriculum).


Part II Exam Weightings 


When preparing for the exam, be familiar with the weightings assigned to each topic 
area within the curriculum. The Part II exam weights and questions are as follows: 


Book  Topic Area                                              Exam Weight  Questions
1     Market Risk Measurement and Management                  20%          16
2     Credit Risk Measurement and Management                  20%          16
3     Operational Risk and Resiliency                         20%          16
4     Liquidity and Treasury Risk Measurement and Management  15%          12
5     Risk Management and Investment Management               15%          12
5     Current Issues in Financial Markets                     10%          8


How to Succeed 


The FRM Part II exam is a formidable challenge (covering 103 assigned readings and 
almost 600 learning objectives), so you must devote considerable time and effort to be 
properly prepared. There are no shortcuts! You must learn the material, know the 
terminology and techniques, understand the concepts, and be able to answer 80 
multiple-choice questions quickly and (at least 70%) correctly. A good estimate of the 
study time required is 300 hours on average, but some candidates will need more or 
less time, depending on their individual backgrounds and experience. 


Expect the Global Association of Risk Professionals (GARP) to test your knowledge in a
way that will reveal how well you know the Part II curriculum. You should begin 
studying early and stick to your study plan. You should first read the SchweserNotes 
and complete the practice questions for each reading. After completing each book, you 
should answer the provided online topic quiz questions to understand how concepts 
may be tested on the exam. 


It is recommended that you finish your initial study of the entire curriculum at least 
two weeks (earlier if possible) prior to your exam date to allow sufficient time for 
practice and targeted review. During this period, you should take all of your Mock 
Exams. This final review period is when you will get a clear indication of how effective 
your study efforts have been and which readings require significant additional review. 
Answering exam-like questions across all readings and working on your exam time 
management skills will be important determinants of your success on exam day. 


Best regards, 




Eric Smith, CFA, FRM, FDP 
Director, Advanced Designations 
Kaplan Schweser 
Readings and Learning Objectives


STUDY SESSION 1 


1. Estimating Market Risk Measures: An Introduction and Overview 
Kevin Dowd, Measuring Market Risk, 2nd Edition (West Sussex, UK: John Wiley & Sons, 2005). 
Chapter 3. 
After completing this reading, you should be able to: 
a. estimate VaR using a historical simulation approach.
b. estimate VaR using a parametric approach for both normal and lognormal return distributions.
c. estimate the expected shortfall given profit and loss (P&L) or return data.
d. estimate risk measures by estimating quantiles.
e. evaluate estimators of risk measures by estimating their standard errors.
f. interpret quantile-quantile (QQ) plots to identify the characteristics of a distribution.
2. Non-Parametric Approaches 
Kevin Dowd, Measuring Market Risk, 2nd Edition (West Sussex, UK: John Wiley & Sons, 2005). 
Chapter 4. 


After completing this reading, you should be able to: 
a. apply the bootstrap historical simulation approach to estimate coherent risk measures. 
b. describe historical simulation using non-parametric density estimation. 
c. compare and contrast the age-weighted, the volatility-weighted, the correlation-weighted, and the 
filtered historical simulation approaches. 
d. identify advantages and disadvantages of non-parametric estimation methods. 
3. Parametric Approaches (II): Extreme Value 
Kevin Dowd, Measuring Market Risk, 2nd Edition (West Sussex, UK: John Wiley & Sons, 2005). 
Chapter 7. 


After completing this reading, you should be able to: 
a. explain the importance and challenges of extreme values in risk management. 
b. describe extreme value theory (EVT) and its use in risk management. 
c. describe the peaks-over-threshold (POT) approach. 
d. compare and contrast the generalized extreme value (GEV) and POT approaches to estimating 
extreme risks. 
e. discuss the application of the generalized Pareto (GP) distribution in the POT approach. 
f. explain the multivariate EVT for risk management. 
4. Backtesting VaR 
Philippe Jorion, Value at Risk: The New Benchmark for Managing Financial Risk, 3rd Edition 
(New York, NY: McGraw Hill, 2007). Chapter 6. 


After completing this reading, you should be able to: 

a. describe backtesting and exceptions and explain the importance of backtesting VaR models.

b. explain the significant difficulties in backtesting a VaR model.

c. verify a model based on exceptions or failure rates.

d. identify and describe Type I and Type II errors in the context of a backtesting process.

e. explain the need to consider conditional coverage in the backtesting framework.

f. describe the Basel rules for backtesting.

5. VaR Mapping 
Philippe Jorion, Value at Risk: The New Benchmark for Managing Financial Risk, 3rd Edition 
(New York, NY: McGraw Hill, 2007). Chapter 11. 


After completing this reading, you should be able to: 

a. explain the principles underlying VaR mapping and describe the mapping process.

b. explain and demonstrate how the mapping process captures general and specific risks. 
c. differentiate among the three methods for mapping portfolios of fixed-income securities. 
d. summarize how to map a fixed-income portfolio into positions of standard instruments. 
e. describe how mapping of risk factors can support stress testing.

f. explain how VaR can be computed and used relative to a performance benchmark. 

g. describe the method of mapping forwards, forward rate agreements, interest rate swaps, and 
options. 

6. Messages From the Academic Literature on Risk Measurement for the Trading Book 
“Messages from the Academic Literature on Risk Measurement for the Trading Book,” Basel 
Committee on Banking Supervision, Working Paper No. 19, Jan 2011. 

After completing this reading, you should be able to: 
a. explain the following lessons on VaR implementation: time horizon over which VaR is estimated, 
the recognition of time-varying volatility in VaR risk factors, and VaR backtesting. 
b. describe exogenous and endogenous liquidity risk and explain how they might be integrated into 
VaR models. 

c. compare VaR, expected shortfall, and other relevant risk measures.

d. compare unified and compartmentalized risk measurement.

e. compare the results of research on top-down and bottom-up risk aggregation methods.

f. describe the relationship between leverage, market value of asset, and VaR within an active
balance sheet management framework.


STUDY SESSION 2 


7. Correlation Basics: Definitions, Applications, and Terminology 
Gunter Meissner, Correlation Risk Modeling and Management, 2nd Edition (Risk Books, 2019). 
Chapter 1. 


After completing this reading, you should be able to: 

a. describe financial correlation risk and the areas in which it appears in finance. 

b. explain how correlation contributed to the global financial crisis of 2007-2009. 

c. describe how correlation impacts the price of quanto options as well as other multi-asset exotic 
options. 

d. describe the structure, uses, and payoffs of a correlation swap. 

e. estimate the impact of different correlations between assets in the trading book on the VaR 
capital charge. 

f. explain the role of correlation risk in market risk and credit risk. 

g. relate correlation risk to systemic and concentration risk. 

8. Empirical Properties of Correlation: How Do Correlations Behave in the Real World? 
Gunter Meissner, Correlation Risk Modeling and Management, 2nd Edition (Risk Books, 2019). 
Chapter 2. 


After completing this reading, you should be able to: 

a. describe how equity correlations and correlation volatilities behave throughout various economic 
states. 

b. calculate a mean reversion rate using standard regression and calculate the corresponding 
autocorrelation. 

c. identify the best-fit distribution for equity, bond, and default correlations. 

9. Financial Correlation Modeling—Bottom-Up Approaches 

Gunter Meissner, Correlation Risk Modeling and Management, 2nd Edition (Risk Books, 2019). 

Chapter 5, pages 126-134. 

After completing this reading, you should be able to: 

a. explain the purpose of copula functions and how they are applied in finance. 

b. describe the Gaussian copula and explain how to use it to derive the joint probability of default of 
two assets. 

c. summarize the process of finding the default time of an asset correlated to all other assets in a
portfolio using the Gaussian copula. 


STUDY SESSION 3 


10. Empirical Approaches to Risk Metrics and Hedging
Bruce Tuckman and Angel Serrat, Fixed Income Securities: Tools for Today’s Markets, 3rd
Edition (Hoboken, NJ: John Wiley & Sons, 2011). Chapter 6.


After completing this reading, you should be able to:

a. explain the drawbacks to using a DV01-neutral hedge for a bond position.

b. describe a regression hedge and explain how it can improve a standard DV01-neutral hedge.

c. calculate the regression hedge adjustment factor, beta.

d. calculate the face value of an offsetting position needed to carry out a regression hedge.

e. calculate the face value of multiple offsetting swap positions needed to carry out a two-variable
regression hedge.

f. compare and contrast level and change regressions.

g. describe principal component analysis and explain how it is applied to constructing a hedging
portfolio.

11. The Science of Term Structure Models
Bruce Tuckman and Angel Serrat, Fixed Income Securities: Tools for Today’s Markets, 3rd
Edition (Hoboken, NJ: John Wiley & Sons, 2011). Chapter 7.


After completing this reading, you should be able to:

a. calculate the expected discounted value of a zero-coupon security using a binomial tree.

b. construct and apply an arbitrage argument to price a call option on a zero-coupon security using
replicating portfolios.

c. define risk-neutral pricing and apply it to option pricing.

d. distinguish between true and risk-neutral probabilities and apply this difference to interest rate
drift.

e. explain how the principles of arbitrage pricing of derivatives on fixed-income securities can be
extended over multiple periods.

f. define option-adjusted spread (OAS) and apply it to security pricing.

g. describe the rationale behind the use of recombining trees in option pricing.

h. calculate the value of a constant-maturity Treasury swap, given an interest rate tree and the
risk-neutral probabilities.

i. evaluate the advantages and disadvantages of reducing the size of the time steps on the pricing of
derivatives on fixed-income securities.

j. evaluate the appropriateness of the Black-Scholes-Merton model when valuing derivatives on
fixed-income securities.

12. The Evolution of Short Rates and the Shape of the Term Structure
Bruce Tuckman and Angel Serrat, Fixed Income Securities: Tools for Today’s Markets, 3rd
Edition (Hoboken, NJ: John Wiley & Sons, 2011). Chapter 8.


After completing this reading, you should be able to:

a. explain the role of interest rate expectations in determining the shape of the term structure.

b. apply a risk-neutral interest rate tree to assess the effect of volatility on the shape of the term
structure.

c. estimate the convexity effect using Jensen’s inequality.

d. evaluate the impact of changes in maturity, yield, and volatility on the convexity of a security.

e. calculate the price and return of a zero-coupon bond incorporating a risk premium.

13. The Art of Term Structure Models: Drift
Bruce Tuckman and Angel Serrat, Fixed Income Securities: Tools for Today’s Markets, 3rd
Edition (Hoboken, NJ: John Wiley & Sons, 2011). Chapter 9.


After completing this reading, you should be able to:

a. construct and describe the effectiveness of a short-term interest rate tree assuming normally
distributed rates, both with and without drift.

b. calculate the short-term rate change and standard deviation of the rate change using a model
with normally distributed rates and no drift.

c. describe methods for addressing the possibility of negative short-term rates in term structure
models.

d. construct a short-term rate tree under the Ho-Lee Model with time-dependent drift.

e. describe uses and benefits of the arbitrage-free models and assess the issue of fitting models to
market prices.

f. describe the process of constructing a simple and recombining tree for a short-term rate under
the Vasicek Model with mean reversion.

g. calculate the Vasicek Model rate change, standard deviation of the rate change, expected rate in T
years and half-life.

h. describe the effectiveness of the Vasicek Model.


14. The Art of Term Structure Models: Volatility and Distribution 
Bruce Tuckman and Angel Serrat, Fixed Income Securities: Tools for Today’s Markets, 3rd 
Edition (Hoboken, NJ: John Wiley & Sons, 2011). Chapter 10. 


After completing this reading, you should be able to: 


a. describe the short-term rate process under a model with time-dependent volatility.
b. calculate the short-term rate change and determine the behavior of the standard deviation of the
rate change using a model with time-dependent volatility.
c. assess the efficacy of time-dependent volatility models.
d. describe the short-term rate process under the Cox-Ingersoll-Ross (CIR) and lognormal models.
e. calculate the short-term rate change and describe the basis point volatility using the CIR and
lognormal models.
f. describe lognormal models with deterministic drift and mean reversion.


15. Volatility Smiles 
John C. Hull, Options, Futures, and Other Derivatives, 10th Edition (New York, NY: Pearson, 
2017). Chapter 20. 


After completing this reading, you should be able to: 


a. describe a volatility smile and volatility skew.
b. explain the implications of put-call parity on the implied volatility of call and put options.
c. compare the shape of the volatility smile (or skew) to the shape of the implied distribution of the
underlying asset price and to the pricing of options on the underlying asset.
d. describe characteristics of foreign exchange rate distributions and their implications on option
prices and implied volatility.
e. describe the volatility smile for equity options and foreign currency options and provide possible
explanations for its shape.
f. describe alternative ways of characterizing the volatility smile.
g. describe volatility term structures and volatility surfaces and how they may be used to price
options.
h. explain the impact of the volatility smile on the calculation of an option’s Greek-letter risk
measures.
i. explain the impact of a single asset price jump on a volatility smile.


16. Fundamental Review of the Trading Book 
John C. Hull, Risk Management and Financial Institutions, 5th Edition (Hoboken, NJ: John Wiley 
& Sons, 2018). Chapter 18. 


After completing this reading, you should be able to: 


a. describe the changes to the Basel framework for calculating market risk capital under the
Fundamental Review of the Trading Book (FRTB) and the motivations for these changes.
b. compare the various liquidity horizons proposed by the FRTB for different asset classes and
explain how a bank can calculate its expected shortfall using the various horizons.
c. explain the FRTB revisions to Basel regulations in the following areas:
   - classification of positions in the trading book compared to the banking book.
   - backtesting, profit and loss attribution, credit risk, and securitizations.


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Dowd, Chapter 3. 


READING 1 


ESTIMATING MARKET RISK 
MEASURES: AN INTRODUCTION 
AND OVERVIEW 


Study Session 1 


EXAM FOCUS 


In this reading, the focus is on the estimation of market risk measures, such as value at 
risk (VaR). VaR identifies the probability that losses will be greater than a pre-specified 
threshold level. For the exam, be prepared to evaluate and calculate VaR using historical 
simulation and parametric models (both normal and lognormal return distributions). 
One drawback to VaR is that it does not estimate losses in the tail of the returns 
distribution. Expected shortfall (ES) does, however, estimate the loss in the tail (i.e.,
after the VaR threshold has been breached) by averaging loss levels at different 
confidence levels. Coherent risk measures incorporate personal risk aversion across the 
entire distribution and are more general than expected shortfall. Quantile-quantile (QQ) 
plots are used to visually inspect if an empirical distribution matches a theoretical 
distribution. 


ESTIMATING RETURNS 


To better understand the material in this reading, it is helpful to recall the 
computations of arithmetic and geometric returns. Note that the convention when 
computing these returns (as well as VaR) is to quote return losses as positive values. 
For example, if a portfolio is expected to decrease in value by $1 million, we use the 
terminology “expected loss is $1 million” rather than “expected profit is -$1 million.” 


Profit/loss data: Change in value of asset/portfolio, P_t, at the end of period t plus any
interim payments, D_t:

\[ P/L_t = P_t + D_t - P_{t-1} \]

Arithmetic return data: Assumption is that interim payments do not earn a return
(i.e., no reinvestment). Hence, this approach is not appropriate for long investment
horizons.

\[ r_t = \frac{P_t + D_t - P_{t-1}}{P_{t-1}} \]

Geometric return data: Assumption is that interim payments are continuously
reinvested. Note that this approach ensures that asset price can never be negative.

\[ R_t = \ln\left(\frac{P_t + D_t}{P_{t-1}}\right) \]
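The three definitions above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample prices (a hypothetical asset moving from 100 to 104 with an interim payment of 1) are assumptions and do not come from the assigned reading.

```python
import math

def profit_loss(p_t, d_t, p_prev):
    """P/L: end-of-period value plus interim payments, less the starting value."""
    return p_t + d_t - p_prev

def arithmetic_return(p_t, d_t, p_prev):
    """Arithmetic return: P/L scaled by the starting value (no reinvestment)."""
    return (p_t + d_t - p_prev) / p_prev

def geometric_return(p_t, d_t, p_prev):
    """Geometric return: log of the gross return (continuous reinvestment)."""
    return math.log((p_t + d_t) / p_prev)

# Hypothetical inputs chosen for illustration only.
p_prev, p_t, d_t = 100.0, 104.0, 1.0
pl = profit_loss(p_t, d_t, p_prev)
r_arith = arithmetic_return(p_t, d_t, p_prev)
r_geom = geometric_return(p_t, d_t, p_prev)
print(pl, r_arith, r_geom)
```

For a positive return, the geometric return is slightly below the arithmetic return, and the gap narrows as the return gets smaller.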


MODULE 1.1: HISTORICAL AND PARAMETRIC 
ESTIMATION APPROACHES 


Historical Simulation Approach 


LO 1.a: Estimate VaR using a historical simulation approach. 


Estimating VaR with a historical simulation approach is by far the simplest and most 
straightforward VaR method. To make this calculation, you simply order return 
observations from largest to smallest. The observation that follows the threshold loss 
level denotes the VaR limit. We are essentially searching for the observation that 
separates the tail from the body of the distribution. More generally, the observation 
that determines VaR for n observations at the (1 - α) confidence level would be: (α x n)
+ 1.


PROFESSOR’S NOTE
Recall that the confidence level, (1 - α), is typically a large value (e.g., 95%)
whereas the significance level, usually denoted as α, is much smaller (e.g.,
5%).


To illustrate this VaR method, assume you have gathered 1,000 monthly returns for a 
security and produced the distribution shown in Figure 1.1. You decide that you want to 
compute the monthly VaR for this security at a confidence level of 95%. At a 95% 
confidence level, the lower tail displays the lowest 5% of the underlying distribution’s
returns. For this distribution, the value associated with a 95% confidence level is a
return of -15.5%. If you have $1,000,000 invested in this security, the one-month VaR is
$155,000 (= 15.5% x $1,000,000).


Figure 1.1: Histogram of Monthly Returns 
[Figure not reproduced: histogram of monthly returns; horizontal axis: Monthly Return]


EXAMPLE: Identifying the VaR limit 


Identify the ordered observation in a sample of 1,000 data points that corresponds 
to VaR at a 95% confidence level. 


Answer: 


Since VaR is to be estimated at 95% confidence, this means that 5% (i.e., 50) of the 
ordered observations would fall in the tail of the distribution. Therefore, the 51st 
ordered loss observation would separate the 5% of largest losses from the 
remaining 95% of returns. 


PROFESSOR’S NOTE
VaR is the quantile that separates the tail from the body of the distribution.
With 1,000 observations at a 95% confidence level, there is a certain level of
arbitrariness in how the ordered observations relate to VaR. In other words,
should VaR be the 50th observation (i.e., α x n), the 51st observation [i.e., (α x
n) + 1], or some combination of these observations? In this example, using the
51st observation was the approximation for VaR, and the method used in the
assigned reading. However, on past FRM exams, VaR using the historical
simulation method has been calculated as just: (α x n), in this case, as the 50th
observation.
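The two ranking conventions discussed above can be compared with a short Python sketch. The function name, the `use_plus_one` flag, and the data set (1,000 hypothetical returns whose losses are simply 1, 2, ..., 1,000) are assumptions made for illustration; this is not the reading's own procedure.

```python
def historical_var(returns, confidence=0.95, use_plus_one=True):
    """Historical-simulation VaR, reported as a positive loss.

    use_plus_one=True picks the (alpha * n) + 1 ordered loss;
    use_plus_one=False picks the (alpha * n) ordered loss.
    """
    losses = sorted((-r for r in returns), reverse=True)  # largest loss first
    alpha = 1.0 - confidence
    k = int(round(alpha * len(losses)))       # number of tail observations
    rank = k + 1 if use_plus_one else k       # 1-based rank of the VaR observation
    rank = max(rank, 1)                       # guard against a zero rank
    return losses[rank - 1]

# Hypothetical data: returns of -1, -2, ..., -1000 (losses of 1 to 1,000).
sample_returns = [-x for x in range(1, 1001)]
var_51st = historical_var(sample_returns, 0.95, use_plus_one=True)
var_50th = historical_var(sample_returns, 0.95, use_plus_one=False)
print(var_51st, var_50th)
```

With 1,000 observations the two conventions select adjacent ordered losses, so the resulting VaR estimates differ only slightly unless the tail is sparse.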


EXAMPLE: Computing VaR 


A long history of profit/loss data closely approximates a standard normal 
distribution (mean equals zero; standard deviation equals one). Estimate the 5% 
VaR using the historical simulation approach. 


Answer: 


The VaR limit will be at the observation that separates the tail loss with area equal 
to 5% from the remainder of the distribution. Since the distribution is closely 
approximated by the standard normal distribution, the VaR is 1.65 (5% critical 
value from the z-table). Recall that since VaR is a one-tailed test, the entire 
significance level of 5% is in the left tail of the returns distribution. 
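As a rough check on this example, the sketch below simulates a long standard normal profit/loss history and reads off the historical simulation VaR. The seed, the sample size of 100,000, and the variable names are arbitrary assumptions; the simulated figure will be close to, but not exactly equal to, the z-table value.

```python
import random
from statistics import NormalDist

random.seed(42)                      # arbitrary seed so the run is repeatable
n = 100_000                          # hypothetical length of the P/L history
pnl = [random.gauss(0.0, 1.0) for _ in range(n)]

losses = sorted((-x for x in pnl), reverse=True)  # largest loss first
k = int(round(0.05 * n))             # observations in the 5% tail
hs_var = losses[k]                   # the (alpha * n) + 1 ordered loss (0-based index k)

z_95 = NormalDist().inv_cdf(0.95)    # exact one-tailed 5% critical value
print(hs_var, z_95)
```

The critical value returned by `inv_cdf` rounds to the 1.65 quoted in the text, and the simulated VaR converges toward it as the history grows.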


From a practical perspective, the historical simulation approach is sensible only if you 
expect future performance to follow the same return generating process as in the past. 
Furthermore, this approach is unable to adjust for changing economic conditions or 
abrupt shifts in parameter values. 


Parametric Estimation Approaches 


LO 1.b: Estimate VaR using a parametric approach for both normal and 
lognormal return distributions. 


In contrast to the historical simulation method, the parametric approach (e.g., the 
delta-normal approach) explicitly assumes a distribution for the underlying 
observations. In this section, we will analyze two cases: (1) VaR for returns that follow 
a normal distribution, and (2) VaR for returns that follow a lognormal distribution. 


Normal VaR 


Intuitively, the VaR for a given confidence level denotes the point that separates the tail 
losses from the remaining distribution. The VaR cutoff will be in the left tail of the 
returns distribution. Hence, the calculated value at risk is negative, but is typically 
reported as a positive value since the negative amount is implied (i.e., it is the value 
that is at risk). In equation form, the VaR at significance level α is:

\[ \text{VaR}(\alpha\%) = -\mu_{P/L} + \sigma_{P/L} \times z_{\alpha} \]

where μ_{P/L} and σ_{P/L} denote the mean and standard deviation of the profit/loss distribution
and z_α denotes the critical value (i.e., quantile) of the standard normal. In practice, the
population parameters μ_{P/L} and σ_{P/L} are not likely known, in which case the researcher will
use the sample mean and standard deviation.


EXAMPLE: Computing VaR (normal distribution) 


Assume that the profit/loss distribution for XYZ is normally distributed with an 
annual mean of $15 million and a standard deviation of $10 million. Calculate the 
VaR at the 95% and 99% confidence levels using a parametric approach. 


Answer: 


VaR(5%) = -$15 million + $10 million x 1.65 = $1.5 million. Therefore, XYZ expects 
to lose at most $1.5 million over the next year with 95% confidence. Equivalently, 
XYZ expects to lose more than $1.5 million with a 5% probability. 


VaR(1%) = -$15 million + $10 million x 2.33 = $8.3 million. Note that the VaR (at 
99% confidence) is greater than the VaR (at 95% confidence) as follows from the 
definition of value at risk. 
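The XYZ calculation can be reproduced with a small Python sketch. The function name is an assumption, and the z-values are the rounded table values used in the example (1.65 and 2.33) rather than exact quantiles.

```python
def normal_var_pl(mean_pl, sd_pl, z):
    """Normal VaR for profit/loss data: minus the mean plus sigma times z."""
    return -mean_pl + sd_pl * z

# XYZ inputs from the example, in $ millions.
var_95 = normal_var_pl(15.0, 10.0, 1.65)
var_99 = normal_var_pl(15.0, 10.0, 2.33)
print(round(var_95, 2), round(var_99, 2))
```

Because the mean enters with a negative sign, a higher expected profit lowers VaR for a given standard deviation.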


Now suppose that the data you are using is arithmetic return data rather than 
profit/loss data. The arithmetic returns follow a normal distribution as well. As you 
would expect, because of the relationship between prices, profits/losses, and returns, 
the corresponding VaR is very similar in format: 

\[ \text{VaR}(\alpha\%) = (-\mu_r + \sigma_r \times z_{\alpha}) \times P_{t-1} \]


EXAMPLE: Computing VaR (arithmetic returns) 


A portfolio has a beginning period value of $100. The arithmetic returns follow a 
normal distribution with a mean of 10% and a standard deviation of 20%. 
Calculate VaR at both the 95% and 99% confidence levels. 


Answer: 


VaR(5%) = (—10% + 1.65 x 20%) x 100 = $23.0 
VaR(1%) = (—10% + 2.33 x 20%) x 100 = $36.6 
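The same arithmetic can be written as a short Python sketch. The function name is an assumption, and the rounded z-values of 1.65 and 2.33 follow the example above.

```python
def normal_var_returns(mean_r, sd_r, z, p_prev):
    """Normal VaR for arithmetic returns, scaled by the beginning-of-period value."""
    return (-mean_r + sd_r * z) * p_prev

# Inputs from the example: 10% mean, 20% standard deviation, $100 starting value.
var_95_ret = normal_var_returns(0.10, 0.20, 1.65, 100.0)
var_99_ret = normal_var_returns(0.10, 0.20, 2.33, 100.0)
print(round(var_95_ret, 1), round(var_99_ret, 1))
```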


Lognormal VaR 


The lognormal distribution is right-skewed with positive outliers and bounded below
by zero. As a result, the lognormal distribution is commonly used to counter the
possibility of negative asset prices (P_t). Technically, if we assume that geometric
returns follow a normal distribution (μ_R, σ_R), then the natural logarithm of asset prices
follows a normal distribution and P_t follows a lognormal distribution. After some
algebraic manipulation, we can derive the following expression for lognormal VaR:

\[ \text{VaR}(\alpha\%) = P_{t-1} \times \left(1 - e^{\mu_R - \sigma_R \times z_{\alpha}}\right) \]


EXAMPLE: Computing VaR (lognormal distribution) 


A diversified portfolio exhibits a normally distributed geometric return with mean 
and standard deviation of 10% and 20%, respectively. Calculate the 5% and 1% 
lognormal VaR assuming the beginning period portfolio value is $100. 


Answer: 
Lognormal VaR(5%) = 100 x (1 — exp[0.1 — 0.2 x 1.65]) 
= 100 x (1 — exp[—0.23]) 
= $20.55 


Lognormal VaR(1%) = 100 x (1 — exp[0.1 — 0.2 x 2.33]) 
= 100 x (1 — exp[—0.366]) 
= $30.65 
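A minimal Python sketch of the lognormal VaR calculation follows. The function name is an assumption, and the z-values are the rounded 1.65 and 2.33 used in the example.

```python
import math

def lognormal_var(mean_geo, sd_geo, z, p_prev):
    """Lognormal VaR: starting value times (1 - exp(mean - sigma * z))."""
    return p_prev * (1.0 - math.exp(mean_geo - sd_geo * z))

# Inputs from the example: 10% mean, 20% standard deviation, $100 starting value.
lognormal_var_95 = lognormal_var(0.10, 0.20, 1.65, 100.0)
lognormal_var_99 = lognormal_var(0.10, 0.20, 2.33, 100.0)
print(round(lognormal_var_95, 2), round(lognormal_var_99, 2))
```

Since the exponential term is always positive, lognormal VaR can never exceed the beginning-of-period value, which is consistent with asset prices being bounded below by zero.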


Note that the calculation of lognormal VaR (geometric returns) and normal VaR 
(arithmetic returns) will be similar when we are dealing with short time periods and 
practical return estimates. 
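To see how the two measures converge over short horizons, the sketch below compares them for the annual inputs used in the examples and for a set of hypothetical daily inputs (a 0.04% mean and a 1% standard deviation, chosen only for illustration). The function names and the daily parameters are assumptions.

```python
import math

Z_95 = 1.65  # rounded one-tailed 5% critical value used in the text

def normal_var(mean_r, sd_r, p_prev, z=Z_95):
    """Normal VaR based on arithmetic returns."""
    return (-mean_r + sd_r * z) * p_prev

def lognormal_var(mean_geo, sd_geo, p_prev, z=Z_95):
    """Lognormal VaR based on geometric returns."""
    return p_prev * (1.0 - math.exp(mean_geo - sd_geo * z))

# Annual inputs from the examples versus hypothetical daily inputs.
annual_gap = normal_var(0.10, 0.20, 100.0) - lognormal_var(0.10, 0.20, 100.0)
daily_gap = normal_var(0.0004, 0.01, 100.0) - lognormal_var(0.0004, 0.01, 100.0)
print(annual_gap, daily_gap)
```

The gap shrinks because 1 - exp(-x) is approximately x when x is small, which is the case for short horizons with modest means and volatilities.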


MODULE QUIZ 1.1
1. The VaR at a 95% confidence level is estimated to be 1.56 from a historical simulation
of 1,000 observations. Which of the following statements is most likely true? 
A. The parametric assumption of normal returns is correct. 
B. The parametric assumption of lognormal returns is correct. 
C. The historical distribution has fatter tails than a normal distribution. 
D. The historical distribution has thinner tails than a normal distribution. 


2. Assume the profit/loss distribution for XYZ is normally distributed with an annual 
mean of $20 million and a standard deviation of $10 million. The 5% VaR is calculated 
and interpreted as which of the following statements? 

A. 5% probability of losses of at least $3.50 million. 

B. 5% probability of earnings of at least $3.50 million. 
C. 95% probability of losses of at least $3.50 million. 

D. 95% probability of earnings of at least $3.50 million. 


MODULE 1.2: RISK MEASURES 
Expected Shortfall 


LO 1.c: Estimate the expected shortfall given profit and loss (P&L) or return data. 


A major limitation of the VaR measure is that it does not tell the investor the amount 
or magnitude of the actual loss. VaR only provides the maximum value we can lose for a 
given confidence level. The expected shortfall (ES) provides an estimate of the tail 
loss by averaging the VaRs for increasing confidence levels in the tail. Specifically, the 
tail mass is divided into n equal slices and the corresponding n - 1 VaRs are computed. 
For example, if n = 5, we can construct the following table based on the normal 
distribution: 


Figure 1.2: Estimating Expected Shortfall 


Confidence Level         VaR      Difference
96%                      1.7507
97%                      1.8808   0.1301
98%                      2.0537   0.1729
99%                      2.3263   0.2726
Average                  2.003
Theoretical true value   2.063


Observe that the VaR increases by progressively larger amounts (see the Difference
column) in order to maintain the same interval mass (of 1%) because the tails become
thinner and thinner. The average of the four computed VaRs is 2.003 and represents the
probability-weighted expected tail loss (a.k.a. expected shortfall). Note that as n
increases, the expected shortfall will increase and approach the theoretical true loss
[2.063 in this case; the average of a high number of VaRs (e.g., greater than 10,000)].
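
The slicing procedure can be sketched in a few lines of Python. This is a minimal illustration that assumes standard normal losses and a 95% confidence level, as in Figure 1.2; the function name `expected_shortfall` is an illustrative choice rather than part of the reading.

```python
from statistics import NormalDist


def expected_shortfall(confidence, n_slices):
    """Average the n_slices - 1 VaRs that split the tail into equal slices."""
    tail = 1 - confidence
    std_normal = NormalDist()  # assumption: standard normal losses
    levels = [confidence + tail * k / n_slices for k in range(1, n_slices)]
    var_values = [std_normal.inv_cdf(level) for level in levels]
    return sum(var_values) / len(var_values)


# n = 5 uses the four VaRs at 96%, 97%, 98%, and 99% from Figure 1.2.
print(round(expected_shortfall(0.95, 5), 3))
# A large n moves the estimate toward the theoretical value.
print(round(expected_shortfall(0.95, 10000), 3))
```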


Estimating Coherent Risk Measures 
LO 1.d: Estimate risk measures by estimating quantiles. 


A more general risk measure than either VaR or ES is known as a coherent risk 
measure. A coherent risk measure is a weighted average of the quantiles of the loss 
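distribution where the weights are user-specific based on individual risk aversion. ES
is a special case of a coherent risk measure. (VaR also fits this weighted-quantile
framework by placing all of the weight on a single quantile, although VaR is not
subadditive in general and is therefore not itself coherent.) When modeling the ES
case, the weighting function is set to [1 / (1 - confidence level)] for all tail losses. All
other quantiles will have a weight of zero.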


Under expected shortfall estimation, the tail region is divided into equal probability 
slices and then multiplied by the corresponding quantiles. Under the more general 
coherent risk measure, the entire distribution is divided into equal probability slices 
weighted by the more general risk aversion (weighting) function. 


This procedure is illustrated for n = 10. First, the entire return distribution is divided
into ten equal probability mass slices, separated by nine (i.e., n - 1) breakpoints at
10%, 20%, ..., 90% (i.e., loss quantiles). Each breakpoint corresponds to a different
quantile. For example, the 10% quantile (confidence level = 10%) relates to -1.2816,
the 20% quantile (confidence level = 20%) relates to -0.8416, and the 90% quantile
(confidence level = 90%) relates to 1.2816. Next, each quantile is weighted by the
specific risk aversion function and then averaged to arrive at the value of the coherent
risk measure.
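
The weighted-quantile idea can be sketched as follows. This is an illustration under stated assumptions: losses are standard normal, the weights are normalized so that they sum to 1, and `exponential_weights` is a hypothetical risk aversion function chosen only for demonstration (the reading does not specify one). The ES weighting function is the one described above.

```python
import math
from statistics import NormalDist


def weighted_quantile_measure(weight_fn, n_slices):
    """Weighted average of the n_slices - 1 quantiles of a standard normal loss."""
    std_normal = NormalDist()
    levels = [k / n_slices for k in range(1, n_slices)]
    weights = [weight_fn(p) for p in levels]
    quantiles = [std_normal.inv_cdf(p) for p in levels]
    total = sum(weights)  # assumption: normalize so the weights sum to 1
    return sum(w * q for w, q in zip(weights, quantiles)) / total


def es_weights(confidence):
    """ES case: constant weight 1 / (1 - confidence) in the tail, zero elsewhere."""
    return lambda p: 1 / (1 - confidence) if p > confidence + 1e-12 else 0.0


def exponential_weights(risk_aversion):
    """Hypothetical risk aversion function: weight rises with the loss quantile."""
    return lambda p: math.exp(-risk_aversion * (1 - p))


print(round(weighted_quantile_measure(es_weights(0.95), 100), 3))
print(round(weighted_quantile_measure(exponential_weights(25), 100), 3))
```

With the ES weighting function and n = 100, only the 96% through 99% quantiles receive weight, so the result matches the average in Figure 1.2.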


This coherent risk measure is more sensitive to the choice of n than expected shortfall,
but will converge to the risk measure’s true value for a sufficiently large number of 
observations. The intuition is that as n increases, the quantiles will be further into the 
tails where more extreme values of the distribution are located. 


Even though the risk measure estimate eventually converges to the true value as the
number of observations is sufficiently large, knowing the exact value of n can be useful.
One approach involves beginning with a small value of n and repeatedly doubling it
until the risk measure estimates stabilize. Every time the number of observations is
doubled, the width of the tail slices is cut in half. This process allows for the calculation
of the “halving error,” and the ideal number of tail slices is found when the halving error
is near zero (i.e., the difference between the estimated risk measures as n increases is
minimal).
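
The doubling procedure can be sketched as below. This is only an illustration: it applies the idea to the expected shortfall of a standard normal loss, and the starting value of n, the tolerance, and the function names are all assumptions made for the example.

```python
from statistics import NormalDist


def tail_average(confidence, n_slices):
    """ES estimate: average of the n_slices - 1 tail VaRs (standard normal)."""
    tail = 1 - confidence
    std_normal = NormalDist()
    levels = [confidence + tail * k / n_slices for k in range(1, n_slices)]
    return sum(std_normal.inv_cdf(p) for p in levels) / (n_slices - 1)


def stabilize(confidence, start=5, tolerance=1e-3, max_doublings=20):
    """Double n until successive estimates differ by less than the tolerance."""
    n = start
    estimate = tail_average(confidence, n)
    for _ in range(max_doublings):
        n *= 2
        new_estimate = tail_average(confidence, n)
        halving_error = abs(new_estimate - estimate)
        estimate = new_estimate
        if halving_error < tolerance:
            return n, estimate, halving_error
    raise RuntimeError("estimate did not stabilize within max_doublings")


n, estimate, halving_error = stabilize(0.95)
print(n, round(estimate, 4), round(halving_error, 6))
```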


LO 1.e: Evaluate estimators of risk measures by estimating their standard errors. 


Sound risk management practice reminds us that estimators are only as useful as their 
precision. That is, estimators that are less precise (i.e., have large standard errors and 
wide confidence intervals) will have limited practical value. Therefore, it is best 
practice to also compute the standard error for all coherent risk measures. 


PROFESSOR’S NOTE 

The process of estimating standard errors for estimators of coherent risk
measures is quite complex, so your focus should be on interpretation of this 
concept. 


First, let’s start with a sample size of n and arbitrary bin width of h around quantile, q. 
Bin width is just the width of the intervals, sometimes called “bins,” in a histogram. 
Computing standard error is done by realizing that the square root of the variance of 
the quantile is equal to the standard error of the quantile. After finding the standard 
error, a confidence interval for a risk measure such as VaR can be constructed as 
follows: 


$$[q + se(q) \times z_{\alpha}] > \text{VaR} > [q - se(q) \times z_{\alpha}]$$


EXAMPLE: Estimating standard errors 


Construct a 90% confidence interval for 5% VaR (the 95% quantile) drawn from a 
standard normal distribution. Assume bin width = 0.1 and that the sample size is 
equal to 500. 


Answer: 


The quantile value, q, corresponds to the 5% VaR which occurs at 1.65 for the 
standard normal distribution. The confidence interval takes the following form: 


$$[1.65 + 1.65 \times se(q)] > \text{VaR} > [1.65 - 1.65 \times se(q)]$$


PROFESSOR’S NOTE 

Recall that a confidence interval is a two-tailed test (unlike VaR), so a 
90% confidence level will have 5% in each tail. Given that this is 
equivalent to the 5% significance level of VaR, the critical values of 1.65 
will be the same in both cases. 


Since bin width is 0.1, q is in the range 1.65 ± 0.1/2 = [1.6, 1.7]. Note that the left tail
probability, p, is the area to the left of -1.7 for a standard normal distribution.


Next, calculate the probability mass between [1.6, 1.7], represented as f(q). From
the standard normal table, the probability of a loss greater than 1.7 is 0.045 (left
tail). Similarly, the probability of a loss less than 1.6 (right tail) is 0.945.
Collectively, f(q) = 1 - 0.045 - 0.945 = 0.01.


The standard error of the quantile is derived from the variance approximation of q 
and is equal to: 
$$se(q) = \frac{\sqrt{p(1 - p) / n}}{f(q)}$$


Now we are ready to substitute in the variance approximation to calculate the 
confidence interval for VaR: 


$$\left[1.65 + 1.65 \times \frac{\sqrt{0.045(1 - 0.045) / 500}}{0.01}\right] > \text{VaR} > \left[1.65 - 1.65 \times \frac{\sqrt{0.045(1 - 0.045) / 500}}{0.01}\right]$$

$$= 3.18 > \text{VaR} > 0.12$$
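
The calculation in this example can be checked with a short Python sketch. The function names are illustrative, and the inputs (p = 0.045, f(q) = 0.01, n = 500, and a critical value of 1.65) are the rounded values used in the example above.

```python
import math


def quantile_standard_error(p, n, f_q):
    """se(q) = sqrt(p * (1 - p) / n) / f(q)."""
    return math.sqrt(p * (1 - p) / n) / f_q


def var_confidence_interval(q, z, p, n, f_q):
    """Return (lower, upper) bounds of q -/+ z * se(q)."""
    half_width = z * quantile_standard_error(p, n, f_q)
    return q - half_width, q + half_width


lower, upper = var_confidence_interval(q=1.65, z=1.65, p=0.045, n=500, f_q=0.01)
print(round(lower, 2), round(upper, 2))
```

Rerunning the sketch with a larger n, or with a larger f(q), narrows the interval.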


Let’s return to the variance approximation and perform some basic comparative
statics. What happens if we increase the sample size holding all other factors
constant? Intuitively, the larger the sample size the smaller the standard error and the 
narrower the confidence interval. 


Now suppose we increase the bin size, h, holding all else constant. This will increase the 
probability mass f(q) and reduce p, the probability in the left tail. The standard error 
will decrease and the confidence interval will again narrow. 


Lastly, suppose that p increases indicating that tail probabilities are more likely. 
Intuitively, the estimator becomes less precise and standard errors increase, which 
widens the confidence interval. Note that the expression p(1 - p) will be maximized at
p = 0.5.


The above analysis was based on one quantile of the loss distribution. Just as the 
previous section generalized the expected shortfall to the coherent risk measure, we 
can do the same for the standard error computation. Thankfully, this complex process is 
not the focus of the LO. 


Quantile-Quantile Plots 


LO 1.f: Interpret quantile-quantile (QQ) plots to identify the characteristics of a 
distribution. 


A natural question to ask in the course of our analysis is, “From what distribution is the 
data drawn?” The truth is that you will never really know since you only observe the 
realizations from random draws of an unknown distribution. However, visual 
inspection can be a very simple but powerful technique. 


In particular, the quantile-quantile (QQ) plot is a straightforward way to visually 
examine if empirical data fits the reference or hypothesized theoretical distribution 
(assume standard normal distribution for this discussion). The process graphs the 
quantiles at regular confidence intervals for the empirical distribution against the 
theoretical distribution. As an example, if both the empirical and theoretical data are 
drawn from the same distribution, then the median (confidence level = 50%) of the 
empirical distribution would plot very close to zero, while the median of the 
theoretical distribution would plot exactly at zero. 


Continuing in this fashion for other quantiles (40%, 60%, and so on) will map out a 
function. If the two distributions are very similar, the resulting QQ plot will be linear. 


Let us compare a theoretical standard normal distribution relative to an empirical
t-distribution (assume that the degrees of freedom for the t-distribution are sufficiently
small and that there are noticeable differences from the normal distribution). We know
that both distributions are symmetric, but the t-distribution will have fatter tails.
Hence, the quantiles near zero (confidence level = 50%) will match up quite closely. As
we move further into the tails, the quantiles between the t-distribution and the normal
will diverge (see Figure 1.3). For example, at a confidence level of 95%, the critical
z-value is -1.65, but for the t-distribution, it is closer to -1.68 (degrees of freedom of
approximately 40). At 97.5% confidence, the difference is even larger, as the z-value is
equal to -1.96 and the t-stat is equal to -2.02. More generally, if the middles of the QQ
plot match up, but the tails do not, then the empirical distribution can be interpreted as
symmetric with tails that differ from a normal distribution (either fatter or thinner).
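
The comparison can be illustrated numerically. The sketch below simulates an "empirical" sample from a t-distribution and pairs its quantiles with standard normal quantiles, which are the points a QQ plot would display. The choice of 4 degrees of freedom, the sample size, the seed, and the function names are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def student_t_sample(df, size, seed=0):
    """Simulate t-distributed draws as Z / sqrt(chi-square / df)."""
    rng = random.Random(seed)
    sample = []
    for _ in range(size):
        z = rng.gauss(0, 1)
        chi2 = rng.gammavariate(df / 2, 2)  # chi-square with df degrees of freedom
        sample.append(z / (chi2 / df) ** 0.5)
    return sample


def qq_points(sample, levels):
    """Pair each theoretical normal quantile with the empirical quantile."""
    ordered = sorted(sample)
    std_normal = NormalDist()
    points = []
    for p in levels:
        index = min(int(p * len(ordered)), len(ordered) - 1)
        points.append((std_normal.inv_cdf(p), ordered[index]))
    return points


levels = [0.01, 0.05, 0.25, 0.50, 0.75, 0.95, 0.99]
for theoretical, empirical in qq_points(student_t_sample(df=4, size=20000), levels):
    print(f"{theoretical:7.3f} {empirical:7.3f}")
```

The middle rows should be close to each other, while the first and last rows show the empirical quantiles lying further from zero than the normal quantiles, which is the fat-tail signature described above.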


Figure 1.3: QQ Plot


MODULE QUIZ 1.2
1. Which of the following statements about expected shortfall estimates and coherent
risk measures are true? 
A. Expected shortfall and coherent risk measures estimate quantiles for the entire 
loss distribution. 
B. Expected shortfall and coherent risk measures estimate quantiles for the tail 
region. 
C. Expected shortfall estimates quantiles for the tail region and coherent risk 
measures estimate quantiles for the non-tail region only. 
D. Expected shortfall estimates quantiles for the entire distribution and coherent risk 
measures estimate quantiles for the tail region only. 


2. Which of the following statements most likely increases standard errors from coherent 
risk measures? 
A. Increasing sample size and increasing the left tail probability. 
B. Increasing sample size and decreasing the left tail probability. 
C. Decreasing sample size and increasing the left tail probability.
D. Decreasing sample size and decreasing the left tail probability.


3. The quantile-quantile plot is best used for what purpose?
A. Testing an empirical distribution from a theoretical distribution. 
B. Testing a theoretical distribution from an empirical distribution. 
C. Identifying an empirical distribution from a theoretical distribution. 
D. Identifying a theoretical distribution from an empirical distribution. 


KEY CONCEPTS 


LO 1.a 


Historical simulation is the easiest method to estimate value at risk. All that is required 
is to reorder the profit/loss observations in increasing magnitude of losses and identify 
the breakpoint between the tail region and the remainder of distribution. 


LO 1.b 


Parametric estimation of VaR requires a specific distribution of prices or equivalently, 
returns. This method can be used to calculate VaR with either a normal distribution or 
a lognormal distribution. 


Under the assumption of a normal distribution, VaR (i.e., delta-normal VaR) is 
calculated as follows: 


$$\text{VaR} = -\mu_{P/L} + \sigma_{P/L} \times z_{\alpha}$$


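Under the assumption of a lognormal distribution, lognormal VaR is calculated as
follows:


$$\text{Lognormal VaR} = P_{t-1} \times \left(1 - e^{\mu_{R} - \sigma_{R} \times z_{\alpha}}\right)$$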


LO 1.c 


VaR identifies the lower bound of the profit/loss distribution, but it does not estimate 
the expected tail loss. Expected shortfall overcomes this deficiency by dividing the tail 
region into equal probability mass slices and averaging their corresponding VaRs. 


LO 1.d 


A more general risk measure than either VaR or ES is known as a coherent risk 
measure. A coherent risk measure is a weighted average of the quantiles of the loss 
distribution where the weights are user-specific based on individual risk aversion. A 
coherent risk measure will assign each quantile (not just tail quantiles) a weight. The 
average of the weighted VaRs is the estimated loss. 


LO 1.e


Sound risk management requires the computation of the standard error of a coherent 
risk measure to estimate the precision of the risk measure itself. The simplest method 
creates a confidence interval around the quantile in question. To compute standard 
error, it is necessary to find the variance of the quantile, which will require estimates 
from the underlying distribution. 


LO 1.f 


The quantile-quantile (QQ) plot is a visual inspection of an empirical quantile relative 
to a hypothesized theoretical distribution. If the empirical distribution closely matches 
the theoretical distribution, the QQ plot would be linear. 


ANSWER KEY FOR MODULE QUIZZES 


Module Quiz 1.1 


1. D The historical simulation indicates that the 5% tail loss begins at 1.56, which is
less than the 1.65 predicted by a standard normal distribution. Therefore, the 
historical simulation has thinner tails than a standard normal distribution. (LO 
1.a) 


2. D The value at risk calculation at 95% confidence is: -20 million + 1.65 x 10 million 
= -$3.50 million. Since the expected loss is negative and VaR is an implied 
negative amount, the interpretation is that XYZ will earn less than +$3.50 million 
with 5% probability, which is equivalent to XYZ earning at least $3.50 million 
with 95% probability. (LO 1.b) 


Module Quiz 1.2 


1. B ES estimates quantiles for n - 1 equal probability masses in the tail region only.
The coherent risk measure estimates quantiles for the entire distribution 
including the tail region. (LO 1.c) 


2. C Decreasing sample size clearly increases the standard error of the coherent risk
measure given that standard error is defined as:


$$se(q) = \frac{\sqrt{p(1 - p) / n}}{f(q)}$$


As the left tail probability, p, increases, the probability of tail events increases,
which also increases the standard error. Mathematically, p(1 - p) increases as p
increases until p = 0.5. Small values of p imply smaller standard errors. (LO 1.e)


3. C Once a sample is obtained, it can be compared to a reference distribution for
possible identification. The QQ plot maps the quantiles one to one. If the 
relationship is close to linear, then a match for the empirical distribution is found. 
The QQ plot is used for visual inspection only without any formal statistical test. 
(LO 1.f) 


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Dowd, Chapter 4. 


READING 2 
NON-PARAMETRIC APPROACHES 


Study Session 1 


EXAM FOCUS 


This reading introduces non-parametric estimation and bootstrapping (i.e., resampling). 
The key difference between these approaches and parametric approaches discussed in 
the previous reading is that with non-parametric approaches the underlying 
distribution is not specified, and it is a data driven, not assumption driven, analysis. For 
example, historical simulation is limited by the discreteness of the data, but
non-parametric analysis “smooths” the data points to allow for any VaR confidence level
between observations. For the exam, pay close attention to the description of the
bootstrap historical simulation approach as well as the various weighted historical
simulation approaches.


MODULE 2.1: NON-PARAMETRIC APPROACHES 


Non-parametric estimation does not make restrictive assumptions about the 
underlying distribution like parametric methods, which assume very specific forms 
such as normal or lognormal distributions. Non-parametric estimation lets the data 
drive the estimation. The flexibility of these methods makes them excellent candidates 
for VaR estimation, especially if tail events are sparse. 


Bootstrap Historical Simulation Approach 


LO 2.a: Apply the bootstrap historical simulation approach to estimate coherent 
risk measures. 


The bootstrap historical simulation is a simple and intuitive estimation procedure. In 
essence, the bootstrap technique draws a sample from the original data set, records the 
VaR from that particular sample and “returns” the data. This procedure is repeated over 
and over and records multiple sample VaRs. Since the data is always “returned” to the 
data set, this procedure is akin to sampling with replacement. The best VaR estimate 
from the full data set is the average of all sample VaRs. 


This same procedure can be performed to estimate the expected shortfall (ES). Each 
drawn sample will calculate its own ES by slicing the tail region into n slices and 
averaging the VaRs at each of the n - 1 quantiles. This is exactly the same procedure 
described in the previous reading. Similarly, the best estimate of the expected shortfall 
for the original data set is the average of all of the sample expected shortfalls. 
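
The resampling loop can be sketched as follows. This is a minimal illustration with assumed names (`historical_var`, `historical_es`, `bootstrap_estimate`) and an assumed number of resamples. For brevity, the ES of each resample is computed as the mean of the losses at or beyond the VaR, which is a simplification of the slice-averaging procedure described in the previous reading.

```python
import random


def historical_var(losses, confidence):
    """VaR as the loss at the chosen quantile of the ordered sample."""
    ordered = sorted(losses)
    index = min(int(confidence * len(ordered)), len(ordered) - 1)
    return ordered[index]


def historical_es(losses, confidence):
    """Simplified ES: mean of the losses at or beyond the VaR."""
    var = historical_var(losses, confidence)
    tail = [x for x in losses if x >= var]
    return sum(tail) / len(tail)


def bootstrap_estimate(losses, estimator, confidence, n_resamples=1000, seed=0):
    """Average the estimator over resamples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(losses, k=len(losses))  # sampling with replacement
        estimates.append(estimator(resample, confidence))
    return sum(estimates) / len(estimates)


# Illustrative data: 500 simulated standard normal losses.
data_rng = random.Random(1)
losses = [data_rng.gauss(0, 1) for _ in range(500)]
print(round(bootstrap_estimate(losses, historical_var, 0.95), 3))
print(round(bootstrap_estimate(losses, historical_es, 0.95), 3))
```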


Empirical analysis demonstrates that the bootstrapping technique consistently 
provides more precise estimates of coherent risk measures than historical simulation 
on raw data alone. 


Applying Non-Parametric Estimation 
LO 2.b: Describe historical simulation using non-parametric density estimation. 


The clear advantage of the traditional historical simulation approach is its simplicity. 
One obvious drawback, however, is that the discreteness of the data does not allow for 
estimation of VaRs between data points. If there were 100 historical observations, then 
it is straightforward to estimate VaR at the 95% or the 96% confidence levels, and so 
on. However, this method is unable to incorporate a confidence level of 95.5%, for 
example. More generally, with n observations, the historical simulation method only 
allows for n different confidence levels. 


One of the advantages of non-parametric density estimation is that the underlying 
distribution is free from restrictive assumptions. Therefore, the existing data points can 
be used to “smooth” the data points to allow for VaR calculation at all confidence levels. 
The simplest adjustment is to connect the midpoints between successive histogram 
bars in the original data set’s distribution. See Figure 2.1 for an illustration of this 
surrogate density function. Notice that by connecting the midpoints, the lower bar 
“receives” area from the upper bar, which “loses” an equal amount of area. In total, no 
area is lost, only displaced, so we still have a probability distribution function, just with 
a modified shape. The shaded area in Figure 2.1 represents a possible confidence 
interval, which can be utilized regardless of the size of the data set. The major 
improvement of this non-parametric approach over the traditional historical 
simulation approach is that VaR can now be calculated for a continuum of points in the 
data set. 


Figure 2.1: Surrogate Density Function


Following this logic, one can see that the linear adjustment is a simple solution to the
interval problem. A more complicated adjustment would involve connecting curves, 
rather than lines, between successive bars to better capture the characteristics of the 
data. 
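
To illustrate how a linear adjustment fills the gaps between observations, the sketch below interpolates linearly between adjacent ordered losses. Note that this is a related but simpler technique than connecting histogram midpoints as in Figure 2.1; it is swapped in here only because it fits in a few lines. The function name and the sample data are assumptions.

```python
def interpolated_var(losses, confidence):
    """VaR by linear interpolation between adjacent ordered observations."""
    ordered = sorted(losses)
    position = confidence * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Illustrative data: 100 losses with values 1, 2, ..., 100.
losses = list(range(1, 101))
for level in (0.95, 0.955, 0.96):
    print(level, round(interpolated_var(losses, level), 3))
```

With 100 observations, the 95.5% level now produces a value between the 95% and 96% estimates instead of being unavailable.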


Weighted Historical Simulation Approaches 


LO 2.c: Compare and contrast the age-weighted, the volatility-weighted, the 
correlation-weighted, and the filtered historical simulation approaches. 


The traditional, equally weighted historical simulation, discussed in Reading 1, assumed
that the n most recent observations, up to a specified (arbitrary) cutoff point, are used
when computing the current period VaR. Older observations beyond the cutoff date are
assumed to have a zero weight and the relevant n observations have equal weight of
(1 / n). While simple in construction, there are obvious problems with this method.
Namely, why is the nth observation as important as all other observations, but the (n + 
1)th observation is so unimportant that it carries no weight? Current VaR may have 
“ghost effects” of previous events that remain in the computation until they disappear 
(after n periods). Furthermore, this method assumes that each observation is 
independent and identically distributed. This is a very strong assumption, which is 
likely violated by data with clear seasonality (i.e., seasonal volatility). This reading 
identifies four improvements to the traditional historical simulation method. 


Age-Weighted Historical Simulation 
The obvious adjustment to the equal-weighted assumption used in historical
simulation is to weight recent observations more and distant observations less. One
method proposed by Boudoukh, Richardson, and Whitelaw is as follows.¹ Assume w(1)
is the probability weight for the observation that is one day old. Then w(2) can be
defined as λw(1), w(3) can be defined as λ²w(1), and so on. The decay parameter, λ, can
take on values 0 < λ < 1 where values close to 1 indicate slow decay. Since all of the
weights must sum to 1, we conclude that w(1) = (1 - λ) / (1 - λⁿ). More generally, the
weight for an observation that is i days old is equal to:

$$w(i) = \frac{\lambda^{i-1}(1 - \lambda)}{1 - \lambda^{n}}$$


The implication of the age-weighted simulation is to reduce the impact of ghost effects
and older events that may not reoccur. Note that this more general weighting scheme
suggests that historical simulation is a special case where λ = 1 (i.e., no decay) over the
estimation window; in the limit as λ approaches 1, every weight tends to 1 / n.


PROFESSOR’S NOTE
This approach is also known as the hybrid approach. 
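
The age-weighting scheme can be sketched in Python as shown below. The function names are illustrative, and the VaR rule used here (accumulate weights from the largest loss downward until the tail probability is reached) is one simple convention assumed for the example.

```python
def age_weights(n, decay):
    """w(i) = decay**(i - 1) * (1 - decay) / (1 - decay**n), i = 1 is newest."""
    scale = (1 - decay) / (1 - decay ** n)
    return [decay ** (i - 1) * scale for i in range(1, n + 1)]


def age_weighted_var(losses_newest_first, decay, confidence):
    """Accumulate weights from the largest loss until the tail mass is covered."""
    weights = age_weights(len(losses_newest_first), decay)
    pairs = sorted(zip(losses_newest_first, weights), reverse=True)
    tail_probability = 0.0
    for loss, weight in pairs:
        tail_probability += weight
        if tail_probability >= 1 - confidence:
            return loss
    return pairs[-1][0]


weights = age_weights(4, 0.5)
print([round(w, 4) for w in weights], round(sum(weights), 4))
# The large loss of 10 is the oldest observation, so it carries little weight.
print(age_weighted_var([1, 2, 3, 10], decay=0.5, confidence=0.90))
```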


Volatility-Weighted Historical Simulation 


Another approach is to weight the individual observations by volatility rather than 
proximity to the current date. This was introduced by Hull and White to incorporate 
changing volatility in risk estimation.² The intuition is that if recent volatility has
increased, then using historical data will underestimate the current risk level. Similarly,
if current volatility is markedly reduced, the impact of older data from periods of higher
volatility will overstate the current risk level.


This process is captured in the expression here for estimating VaR on day T. The
expression is achieved by adjusting each daily return, $r_{t,i}$, on day t upward or downward
based on the then-current volatility forecast, $\sigma_{t,i}$ (estimated from a GARCH or EWMA
model), relative to the current volatility forecast on day T, $\sigma_{T,i}$:


$$r^{*}_{t,i} = \left(\frac{\sigma_{T,i}}{\sigma_{t,i}}\right) r_{t,i}$$


where:
$r_{t,i}$ = actual return for asset i on day t
$\sigma_{t,i}$ = volatility forecast for asset i on day t (made at the end of day t - 1)
$\sigma_{T,i}$ = current forecast of volatility for asset i


Thus, the actual return, $r_{t,i}$, is replaced with a larger (smaller) volatility-adjusted
return, $r^{*}_{t,i}$, if current volatility exceeds (is below) the volatility forecast on day t.
Now, VaR, ES, and any other coherent risk measure can be calculated in the usual way
after substituting historical returns with volatility-adjusted returns.
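
The adjustment can be sketched as below. The volatility forecasts are taken as given inputs here (in practice they would come from a GARCH or EWMA model), and the function names and sample numbers are assumptions for illustration.

```python
def volatility_adjusted_returns(returns, vol_forecasts, current_vol):
    """Scale each return by (current volatility / that day's volatility forecast)."""
    return [r * current_vol / vol for r, vol in zip(returns, vol_forecasts)]


def historical_var(returns, confidence):
    """VaR from ordered losses, where loss = -return."""
    losses = sorted(-r for r in returns)
    index = min(int(confidence * len(losses)), len(losses) - 1)
    return losses[index]


# Hypothetical inputs: past volatility forecasts are below the current 3% forecast.
returns = [0.01, -0.02, 0.005, -0.01]
vol_forecasts = [0.01, 0.02, 0.01, 0.02]
adjusted = volatility_adjusted_returns(returns, vol_forecasts, current_vol=0.03)
print([round(r, 4) for r in adjusted])
print(round(historical_var(returns, 0.75), 4), round(historical_var(adjusted, 0.75), 4))
```

Because current volatility exceeds every historical forecast in this example, the adjusted returns are larger in magnitude and the resulting VaR is higher than the one based on the raw history.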


There are several advantages of the volatility-weighted method. First, it explicitly 
incorporates volatility into the estimation procedure in contrast to other historical 
methods. Second, the near-term VaR estimates are likely to be more sensible in light of 
current market conditions. Third, the volatility-adjusted returns allow for VaR 
estimates that are higher than estimates with the historical data set. 


Correlation-Weighted Historical Simulation 


As the name suggests, this methodology incorporates updated correlations between 
asset pairs. This procedure is more complicated than the volatility-weighting approach,
but it follows the same basic principles. Since the corresponding LO does not require
calculations, the exact matrix algebra would only complicate our discussion. Intuitively, 
the historical correlation (or equivalently variance-covariance) matrix needs to be 
adjusted to the new information environment. This is accomplished, loosely speaking, 
by “multiplying” the historic returns by the revised correlation matrix to yield updated 
correlation-adjusted returns. 


Let us look at the variance-covariance matrix more closely. In particular, we are 
concerned with diagonal elements and the off-diagonal elements. The off-diagonal 
elements represent the current covariance between asset pairs. On the other hand, the 
diagonal elements represent the updated variances (covariance of the asset return with 
itself) of the individual assets.


Notice that updated variances were utilized in the previous approach as well. Thus,
correlation-weighted simulation is an even richer analytical tool than
volatility-weighted simulation because it allows for updated variances (volatilities) as
well as covariances (correlations).


Filtered Historical Simulation 


The filtered historical simulation is the most comprehensive, and hence most 
complicated, of the non-parametric estimators. The process combines the historical 
simulation model with conditional volatility models (like GARCH or asymmetric 
GARCH). Thus, the method contains both the attractions of the traditional historical 
simulation approach with the sophistication of models that incorporate changing 
volatility. In simplified terms, the model is flexible enough to capture conditional 
volatility and volatility clustering as well as a surprise factor that could have an 
asymmetric effect on volatility. 


The model will forecast volatility for each day in the sample period, and the realized
returns will be standardized by dividing each return by its volatility forecast.
Bootstrapping is used to simulate returns from these standardized returns, which are
then scaled to incorporate the current volatility level. Finally, the VaR is identified from
the simulated distribution. The methodology can be extended over longer holding
periods or for multi-asset portfolios.


In sum, the filtered historical simulation method uses bootstrapping and combines the 
traditional historical simulation approach with rich volatility modeling. The results are 
then sensitive to changing market conditions and can predict losses outside the 
historical range. From a computational standpoint, this method is very reasonable even 
for large portfolios, and empirical evidence supports its predictive ability. 
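
The steps of filtered historical simulation can be sketched for a single asset and a one-day horizon. To keep the example short, an EWMA volatility forecast is swapped in for the GARCH model; the decay value of 0.94, the seeding of the variance with the full-sample average (which uses look-ahead information), and all function names are assumptions made for illustration.

```python
import random


def ewma_volatilities(returns, decay=0.94):
    """Return per-day volatility forecasts and the forecast for the next day."""
    variance = sum(r * r for r in returns) / len(returns)  # simplified seed value
    forecasts = []
    for r in returns:
        forecasts.append(variance ** 0.5)  # forecast made before r is observed
        variance = decay * variance + (1 - decay) * r * r
    return forecasts, variance ** 0.5


def filtered_hs_var(returns, confidence, n_simulations=10000, seed=0):
    """Bootstrap standardized returns and rescale them by current volatility."""
    forecasts, current_vol = ewma_volatilities(returns)
    standardized = [r / vol for r, vol in zip(returns, forecasts)]
    rng = random.Random(seed)
    draws = rng.choices(standardized, k=n_simulations)  # sampling with replacement
    simulated_losses = sorted(-z * current_vol for z in draws)
    index = min(int(confidence * n_simulations), n_simulations - 1)
    return simulated_losses[index]


# Illustrative data: 750 simulated daily returns with 1% volatility.
data_rng = random.Random(3)
returns = [data_rng.gauss(0, 0.01) for _ in range(750)]
print(round(filtered_hs_var(returns, 0.95), 4))
```

Because the standardized returns are rescaled by the current volatility forecast, a simulated loss can exceed the largest loss in the historical sample when current volatility is high.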


Advantages and Disadvantages of Non-Parametric 
Methods 


LO 2.d: Identify advantages and disadvantages of non-parametric estimation 
methods. 


Any risk manager should be prepared to use non-parametric estimation techniques. 
There are some clear advantages to non-parametric methods, but there is some danger 
as well. Therefore, it is incumbent to understand the advantages, the disadvantages, and 
the appropriateness of the methodology for analysis. 


Advantages of non-parametric methods include the following: 

■ Intuitive and often computationally simple (even on a spreadsheet).
■ Not hindered by parametric violations of skewness, fat tails, et cetera.
■ Avoids complex variance-covariance matrices and dimension problems.
■ Data is often readily available and does not require adjustments (e.g., financial
statements adjustments).
■ Can accommodate more complex analysis (e.g., by incorporating age-weighting with
volatility-weighting).


Disadvantages of non-parametric methods include the following: 

■ Analysis depends critically on historical data.
■ Volatile data periods lead to VaR and ES estimates that are too high.
■ Quiet data periods lead to VaR and ES estimates that are too low.
■ Difficult to detect structural shifts/regime changes in the data.
■ Cannot accommodate plausible large impact events if they did not occur within the
sample period.
■ Difficult to estimate losses significantly larger than the maximum loss within the
data set (historical simulation cannot; volatility-weighting can, to some degree).
■ Need sufficient data, which may not be possible for new instruments or markets.


MODULE QUIZ 2.1


1. Johanna Roberto has collected a data set of 1,000 daily observations on equity returns. 
She is concerned about the appropriateness of using parametric techniques as the 
data appears skewed. Ultimately, she decides to use historical simulation and 
bootstrapping to estimate the 5% VaR. Which of the following steps is most likely to 
be part of the estimation procedure? 

A. Filter the data to remove the obvious outliers. 

B. Repeated sampling with replacement. 

C. Identify the tail region from reordering the original data. 

D. Apply a weighting procedure to reduce the impact of older data. 


2. All of the following approaches improve the traditional historical simulation approach 
for estimating VaR except the: 
A. volatility-weighted historical simulation. 
B. age-weighted historical simulation.
C. market-weighted historical simulation.
D. correlation-weighted historical simulation.


3. Which of the following statements about age-weighting is most accurate?

A. The age-weighting procedure incorporates estimates from GARCH models. 

B. If the decay factor in the model is close to 1, there is persistence within the data 
set. 

C. When using this approach, the weight assigned on day i is equal to
$w(i) = \lambda^{i-1} \times (1 - \lambda) / (1 - \lambda^{i})$.

D. The number of observations should at least exceed 250. 


4. Which of the following statements about volatility-weighting is true? 
A. Historic returns are adjusted, and the VaR calculation is more complicated. 
B. Historic returns are adjusted, and the VaR calculation procedure is the same. 
C. Current period returns are adjusted, and the VaR calculation is more complicated. 
D. Current period returns are adjusted, and the VaR calculation is the same. 


5. All of the following items are generally considered advantages of non-parametric 
estimation methods except: 
A. ability to accommodate skewed data. 
B. availability of data. 
C. use of historical data. 
D. little or no reliance on covariance matrices. 


KEY CONCEPTS 


LO 2.a 


Bootstrapping involves resampling a subset of the original data set with replacement. 
Each draw (subsample) yields a coherent risk measure (VaR or ES). The average of the 
risk measures across all samples is then the best estimate. 


LO 2.b 


The discreteness of historical data reduces the number of possible VaR estimates since 
historical simulation cannot adjust for significance levels between ordered 
observations. However, non-parametric density estimation allows the original 
histogram to be modified to fill in these gaps. The process connects the midpoints 
between successive columns in the histogram. The area is then “removed” from the 
upper bar and “placed” in the lower bar, which creates a “smooth” function between the 
original data points. 


LO 2.c 

One important limitation to the historical simulation method is the equal weight 
assumed for all data in the estimation period, and zero weight otherwise. This arbitrary 
methodology can be improved by using age-weighted simulation, volatility-weighted 
simulation, correlation-weighted simulation, and filtered historical simulation. 


The age-weighted simulation method adjusts the most recent (distant) observations to 
be more (less) heavily weighted. 


The volatility-weighting procedure incorporates the possibility that volatility may 
change over the estimation period, which may understate or overstate current risk by 
including stale data. The procedure replaces historic returns with volatility-adjusted 
returns; however, the actual procedure of estimating VaR is unchanged (i.e., only the 
data inputs change). 


Correlation-weighted simulation updates the variance-covariance matrix between the 
assets in the portfolio. The off-diagonal elements represent the covariance pairs while 
the diagonal elements update the individual variance estimates. Therefore, the 
correlation-weighted methodology is more general than the volatility-weighting 
procedure by incorporating both variance and covariance adjustments. 


Filtered historical simulation is the most complex estimation method. The procedure 
relies on bootstrapping of standardized returns based on volatility forecasts. The 
volatility forecasts arise from GARCH or similar models and are able to capture 
conditional volatility, volatility clustering, and/or asymmetry. 


LO 2.d 


Advantages of non-parametric models include: data can be skewed or have fat tails; 
they are conceptually straightforward; there is readily available data; and they can 
accommodate more complex analysis. Disadvantages focus mainly on the use of 
historical data, which limits the VaR forecast to (approximately) the maximum loss in 
the data set; they are slow to respond to changing market conditions; they are affected 
by volatile (quiet) data periods; and they cannot accommodate plausible large losses if 
not in the data set. 


ANSWER KEY FOR MODULE QUIZ 


Module Quiz 2.1 


1.B Bootstrapping from historical simulation involves repeated sampling with 
replacement. The 5% VaR is recorded from each sample draw. The average of the 
VaRs from all the draws is the VaR estimate. The bootstrapping procedure does 
not involve filtering the data or weighting observations. Note that the VaR from 
the original data set is not used in the analysis. (LO 2.a) 


2.C Market-weighted historical simulation is not discussed in this reading. Age- 
weighted historical simulation weights observations higher when they appear 
closer to the event date. Volatility-weighted historical simulation adjusts for 
changing volatility levels in the data. Correlation-weighted historical simulation 
incorporates anticipated changes in correlation between assets in the portfolio. 
(LO 2.c) 


3.B If the intensity parameter (i.e., decay factor) is close to 1, there will be persistence 
(i.e., slow decay) in the estimate. The expression for the weight on day i has i in 
the exponent when it should be n. While a large sample size is generally preferred, 
some of the data may no longer be representative in a large sample. (LO 2.c)


4.B The volatility-weighting method adjusts historic returns for current volatility. 
Specifically, return at time t is multiplied by (current volatility estimate / 
volatility estimate at time t). However, the actual procedure for calculating VaR 
using a historical simulation method is unchanged; it is only the inputted data 
that changes. (LO 2.c) 


5.C The use of historical data in non-parametric analysis is a disadvantage, not an 
advantage. If the estimation period was quiet (volatile) then the estimated risk 
measures may understate (overstate) the current risk level. Generally, the largest 
VaR cannot exceed the largest loss in the historical period. On the other hand, the 
remaining choices are all considered advantages of non-parametric methods. For 
instance, the non-parametric nature of the analysis can accommodate skewed 
data, data points are readily available, and there is no requirement for estimates 
of covariance matrices. (LO 2.d) 


1. Boudoukh, J., M. Richardson, and R. Whitelaw. 1998. “The best of both worlds: a hybrid approach to calculating 
value at risk.” Risk 11: 64-67. 


2. Hull, J., and A. White. 1998. “Incorporating volatility updating into the historical simulation method for value- 
at-risk.” Journal of Risk 1: 5-19. 


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Dowd, Chapter 7. 


READING 3 


PARAMETRIC APPROACHES (II): 
EXTREME VALUE 


Study Session 1 


EXAM FOCUS 


Extreme values are important for risk management because they are associated with 
catastrophic events such as the failure of large institutions and market crashes. Since 
they are rare, modeling such events is a challenging task. In this reading, we will 
address the generalized extreme value (GEV) distribution, and the peaks-over-threshold 
approach, as well as discuss how peaks-over-threshold converges to the generalized 
Pareto distribution. 


MODULE 3.1: EXTREME VALUES 


Managing Extreme Values 


LO 3.a: Explain the importance and challenges of extreme values in risk 
management. 


The occurrence of extreme events is rare; however, it is crucial to identify these 
extreme events for risk management since they can prove to be very costly. Extreme 
values are the result of large market declines or crashes, the failure of major 
institutions, the outbreak of financial or political crises, or natural catastrophes. The 
challenge of analyzing and modeling extreme values is that there are only a few 
observations for which to build a model, and there are ranges of extreme values that 
have yet to occur. 


To meet the challenge, researchers must assume a certain distribution. The assumed 
distribution will probably not be identical to the true distribution; therefore, some 
degree of error will be present. Researchers usually choose distributions based on 
measures of central tendency, which misses the issue of trying to incorporate extreme 
values. Researchers need approaches that specifically deal with extreme value 
estimation. Incidentally, researchers in many fields other than finance face similar 
problems. In flood control, for example, analysts have to model the highest possible 
flood line when building a dam, and this estimation would most likely require a height 
above observed levels of flooding to date.


Extreme Value Theory 


LO 3.b: Describe extreme value theory (EVT) and its use in risk management. 


Extreme value theory (EVT) is a branch of applied statistics that has been developed 
to address problems associated with extreme outcomes. EVT focuses on the unique 
aspects of extreme values and is different from “central tendency” statistics, in which 
the central limit theorem plays an important role. Extreme value theorems provide a 
template for estimating the parameters used to describe extreme movements. 


One approach for estimating parameters is the Fisher-Tippett theorem. According to 
this theorem, as the sample size n gets large, the distribution of extremes, denoted M_n, 
converges to the following distribution known as the generalized extreme value 
(GEV) distribution:

$$
F(x) = \begin{cases} \exp\left\{-\left[1 + \xi\,\dfrac{x-\mu}{\sigma}\right]^{-1/\xi}\right\} & \text{if } \xi \neq 0 \\ \exp\left\{-\exp\left(-\dfrac{x-\mu}{\sigma}\right)\right\} & \text{if } \xi = 0 \end{cases}
$$

For these formulas, the following restriction holds for random variable X:

$$
1 + \xi\,\frac{x-\mu}{\sigma} > 0
$$

The parameters μ and σ are the location parameter and scale parameter, respectively, of 
the limiting distribution. Although related to the mean and variance, they are not the 
same. The symbol ξ is the tail index and indicates the shape (or heaviness) of the tail of 
the limiting distribution. There are three general cases of the GEV distribution:


1. ξ > 0, the GEV becomes a Frechet distribution, and the tails are “heavy” as is the case 
for the t-distribution and Pareto distributions.

2. ξ = 0, the GEV becomes the Gumbel distribution, and the tails are “light” as is the case 
for the normal and lognormal distributions.

3. ξ < 0, the GEV becomes the Weibull distribution, and the tails are “lighter” than a 
normal distribution.
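
The three cases can be compared numerically. The sketch below assumes the standard textbook form of the GEV distribution function, F(x) = exp{-[1 + ξ(x - μ)/σ]^(-1/ξ)} for ξ ≠ 0 and F(x) = exp{-exp[-(x - μ)/σ]} for ξ = 0; the function name is hypothetical and the parameter values are chosen only for illustration.

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma > 0, tail index xi."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if xi == 0:
        # Gumbel case: light tails
        return math.exp(-math.exp(-z))
    t = 1.0 + xi * z
    if t <= 0:
        # Outside the support: below the lower bound when xi > 0,
        # above the upper bound when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


# Probability of exceeding x = 5 with mu = 0 and sigma = 1
frechet_tail = 1.0 - gev_cdf(5.0, 0.0, 1.0, 0.5)  # xi > 0: heavy tail
gumbel_tail = 1.0 - gev_cdf(5.0, 0.0, 1.0, 0.0)   # xi = 0: light tail
```

With these inputs the Frechet case assigns a noticeably larger probability to the extreme outcome than the Gumbel case, which is what "heavy" versus "light" tails means in practice.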


Distributions where ξ < 0 do not often appear in financial models; therefore, financial 
risk management analysis can essentially focus on the first two cases: ξ > 0 and ξ = 0. 
Consequently, one practical consideration the researcher faces is whether to assume either 
ξ > 0 or ξ = 0 and apply the respective Frechet or Gumbel distributions and their 
corresponding estimation procedures. There are three basic ways of making this choice.

1. The researcher is confident of the parent distribution. If the researcher is confident 
it is a t-distribution, for example, then the researcher should assume ξ > 0.

2. The researcher applies a statistical test and cannot reject the hypothesis ξ = 0. In this 
case, the researcher uses the assumption ξ = 0.

3. The researcher may wish to be conservative and assume ξ > 0 to avoid model risk.


Estimating EV parameters (i.e., μ, σ, and ξ) can be done by applying the maximum 
likelihood (ML) method, the regression method, or the semi-parametric method. The 
ML approach involves maximizing a complex likelihood or log-likelihood function. The 
regression approach is easier to apply than ML and consists of ordering extreme values 
from lowest to highest. Semi-parametric methods are often used to estimate the tail 
index, where the most popular approach is the Hill estimator. The most challenging 
aspect of this method is selecting how many observations to include in the tail. 
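
As an illustration of the semi-parametric route, the following sketch implements the usual form of the Hill estimator: the average of the logs of the k largest observations minus the log of the (k+1)-th largest. It assumes positive losses and a heavy-tailed (ξ > 0) distribution; the function name and the toy data are hypothetical.

```python
import math


def hill_estimator(losses, k):
    """Hill estimate of the tail index from the k largest observations."""
    if k < 1 or k >= len(losses):
        raise ValueError("k must satisfy 1 <= k < len(losses)")
    ordered = sorted(losses, reverse=True)  # largest loss first
    if ordered[k] <= 0:
        raise ValueError("the k + 1 largest observations must be positive")
    # Average log of the k largest values minus log of the (k+1)-th largest value
    return sum(math.log(x) for x in ordered[:k]) / k - math.log(ordered[k])


# Toy data only; in practice k is varied and the stability of the estimate is inspected.
xi_hat = hill_estimator([1.0, 2.0, 4.0, 8.0], k=2)
```

The estimate depends on k, the number of tail observations included, which is exactly the choice described above as the hardest part of the method.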


Peaks-Over-Threshold 


LO 3.c: Describe the peaks-over-threshold (POT) approach. 


The peaks-over-threshold (POT) approach is an application of extreme value theory to 
the distribution of excess losses over a high threshold. The POT approach generally 
requires fewer parameters than approaches based on extreme value theorems. The POT 
approach provides the natural way to model values that are greater than a high 
threshold, just as GEV theory provides the natural way to model the maxima or 
minima of a large sample.


The POT approach begins by defining a random variable X to be the loss. We define u as 
the threshold value for positive values of x, and the distribution of excess losses over 
our threshold u as: 


$$
F_u(x) = P\{X - u \le x \mid X > u\} = \frac{F(x + u) - F(u)}{1 - F(u)}
$$


This is the conditional probability that X exceeds the threshold by no more than x, 
given that the threshold is exceeded. The parent distribution of X can be normal or 
lognormal; however, it will usually be unknown.


Generalized Pareto Distribution 


LO 3.e: Discuss the application of the generalized Pareto (GP) distribution in the 
POT approach. 


The Gnedenko-Pickands-Balkema-deHaan (GPBdH) theorem says that as u gets large, 
the distribution F_u(x) converges to a generalized Pareto (GP) distribution, such that:

$$
G_{\xi,\beta}(x) = \begin{cases} 1 - \left(1 + \dfrac{\xi x}{\beta}\right)^{-1/\xi} & \text{if } \xi \neq 0 \\ 1 - \exp\left(-\dfrac{x}{\beta}\right) & \text{if } \xi = 0 \end{cases}
$$

The distribution is defined for the following regions:

$$
x \ge 0 \text{ for } \xi \ge 0 \quad \text{and} \quad 0 \le x \le -\beta/\xi \text{ for } \xi < 0
$$

The tail (or shape) index parameter, ξ, is the same as it is in GEV theory. It can be 
positive, zero, or negative, but we are mainly interested in the cases when it is zero or 
positive. Here, the beta symbol, β, represents the scale parameter, which is a positive 
value.


The GP distribution exhibits a curve that dips below the normal distribution prior to 
the tail. It then moves above the normal distribution until it reaches the extreme tail. 
The GP distribution then provides a linear approximation of the tail, which more 
closely matches empirical data. 


Since all distributions of excess losses converge to the GP distribution, it is the natural 
model for excess losses. It requires a selection of u, which determines the number of 
observations, N_u, in excess of the threshold value. Choosing the threshold involves a 
tradeoff. It needs to be high enough so the GPBdH theory can apply, but it must be low 
enough so that there will be enough observations to apply estimation techniques to the 
parameters. 


VaR and Expected Shortfall 


One of the goals of using the POT approach is to ultimately compute the value at risk 
(VaR). From estimates of VaR, we can derive the expected shortfall (a.k.a. conditional 
VaR). Expected shortfall is viewed as an average or expected value of all losses greater 
than the VaR. An expression for this is: E[L_P | L_P > VaR]. Because it gives an insight into 
the distribution of the size of losses greater than the VaR, it has become a popular 
measure to report along with VaR.


The expression for VaR using POT parameters is given as follows:

$$
\text{VaR} = u + \frac{\beta}{\xi}\left\{\left[\frac{n}{N_u}\,(1 - \text{confidence level})\right]^{-\xi} - 1\right\}
$$

where:

u = threshold (in percentage terms)
n = number of observations
N_u = number of observations that exceed threshold


The expected shortfall can then be defined as:

$$
\text{ES} = \frac{\text{VaR}}{1 - \xi} + \frac{\beta - \xi u}{1 - \xi}
$$


EXAMPLE: Compute VaR and expected shortfall given POT estimates

Assume the following observed parameter values:

β = 0.75.
ξ = 0.25.
u = 1%.
N_u/n = 5%.


Compute the 1% VaR in percentage terms and the corresponding expected 
shortfall measure. 


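Answer:

$$
\text{VaR} = 1 + \frac{0.75}{0.25}\left\{\left[\frac{1}{0.05}\,(1 - 0.99)\right]^{-0.25} - 1\right\} = 2.486\%
$$

$$
\text{ES} = \frac{2.486}{1 - 0.25} + \frac{0.75 - 0.25 \times 1}{1 - 0.25} = 3.981\%
$$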
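
The same calculation can be checked with a short script. This is a sketch, assuming the POT formulas VaR = u + (β/ξ){[(n/N_u)(1 - confidence level)]^(-ξ) - 1} and ES = VaR/(1 - ξ) + (β - ξu)/(1 - ξ); the function names are hypothetical, and the inputs are the parameter values from the example.

```python
def pot_var(u, beta, xi, exceed_ratio, confidence):
    """VaR from POT estimates; exceed_ratio is N_u / n, u is in percentage terms."""
    if xi == 0:
        raise ValueError("this form of the formula requires xi != 0")
    tail_prob = 1.0 - confidence
    return u + (beta / xi) * ((tail_prob / exceed_ratio) ** (-xi) - 1.0)


def pot_expected_shortfall(var, u, beta, xi):
    """Expected shortfall implied by the POT VaR (requires xi < 1)."""
    if xi >= 1:
        raise ValueError("expected shortfall is not finite for xi >= 1")
    return var / (1.0 - xi) + (beta - xi * u) / (1.0 - xi)


var_1pct = pot_var(u=1.0, beta=0.75, xi=0.25, exceed_ratio=0.05, confidence=0.99)
es_1pct = pot_expected_shortfall(var_1pct, u=1.0, beta=0.75, xi=0.25)
print(f"VaR = {var_1pct:.3f}%, ES = {es_1pct:.3f}%")
```

Because the expected shortfall averages all losses beyond the VaR, it is always the larger of the two numbers.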


Generalized Extreme Value and Peaks-Over-Threshold 


LO 3.d: Compare and contrast the generalized extreme value (GEV) and POT 
approaches to estimating extreme risks. 


Extreme value theory is the source of both the GEV and POT approaches. These 
approaches are similar in that they both have a tail parameter denoted ξ. There is a 
subtle difference in that GEV theory focuses on the distributions of extremes, whereas 
POT focuses on the distribution of values that exceed a certain threshold. Although 
very similar in concept, there are cases where a researcher might choose one over the 
other. Here are three considerations. 

1. GEV requires the estimation of one more parameter than POT. The most popular 
approaches of the GEV can lead to loss of useful data relative to the POT.


2. The POT approach requires a choice of a threshold, which can introduce additional 
uncertainty. 


3. The nature of the data may make one preferable to the other. 


Multivariate EVT 


LO 3.f: Explain the multivariate EVT for risk management. 


Multivariate EVT is important because we can easily see how extreme values can be 
dependent on each other. A terrorist attack on oil fields will produce losses for oil 
companies, but it is likely that the value of most financial assets will also be affected. 
We can imagine similar relationships between the occurrence of a natural disaster and 
a decline in financial markets as well as markets for real goods and services. 


Multivariate EVT has the same goal as univariate EVT in that the objective is to move 
from the familiar central-value distributions to methods that estimate extreme events. 
The added feature is to apply the EVT to more than one random variable at the same 
time. This introduces the concept of tail dependence, which is the central focus of 
multivariate EVT. Assumptions of an elliptical distribution and the use of a covariance 
matrix are of limited use for multivariate EVT. 


Modeling multivariate extremes requires the use of copulas. Multivariate EVT says that 
the limiting distribution of multivariate extreme values will be a member of the family 
of EV copulas, and we can model multivariate EV dependence by assuming one of these 
EV copulas. The copulas can also have as many dimensions as appropriate and 
congruous with the number of random variables under consideration. However, the 
increase in the dimensions will present problems. If a researcher has two independent 
variables and classifies univariate extreme events as those that occur one time in 100, 
this means that the researcher should expect to see one multivariate extreme event (i.e., 
both variables taking extreme values) only one time in 100 × 100 = 10,000 
observations. For three independent variables, that number increases to 1,000,000. This 
reduces drastically the number of multivariate extreme observations to work with, and 
increases the number of parameters to estimate.


MODULE QUIZ 3.1

1. According to the Fisher-Tippett theorem, as the sample size n gets large, the 
distribution of extremes converges to a:

A. normal distribution.

B. uniform distribution. 

C. generalized Pareto distribution. 

D. generalized extreme value distribution. 


2. The peaks-over-threshold approach generally requires: 

A. more estimated parameters than the GEV approach and shares one parameter 
with the GEV. 

B. fewer estimated parameters than the GEV approach and shares one parameter 
with the GEV. 

C. more estimated parameters than the GEV approach and does not share any 
parameters with the GEV approach. 

D. fewer estimated parameters than the GEV approach and does not share any 
parameters with the GEV approach. 


3. In setting the threshold in the POT approach, which of the following statements is the 
most accurate? Setting the threshold relatively high makes the model: 
A. more applicable but decreases the number of observations in the modeling 
procedure. 
B. less applicable and decreases the number of observations in the modeling 
procedure. 
C. more applicable but increases the number of observations in the modeling 
procedure. 
D. less applicable but increases the number of observations in the modeling 
procedure.


4. A researcher using the POT approach observes the following parameter values: β = 
0.9, ξ = 0.15, u = 2%, and N_u/n = 4%. The 5% VaR in percentage terms is:


A. 1.034. 
B. 1.802. 
C. 2.204. 
D. 16.559. 


5. Given a VaR equal to 2.56, a threshold of 1%, a shape parameter equal to 0.2, and a 
scale parameter equal to 0.3, what is the expected shortfall? 
A. 3.325. 
B. 3.526. 
C. 3.777.
D. 4.086.


KEY CONCEPTS 


LO 3.a 

Estimating extreme values is important since they can be very costly. The challenge is 
that since they are rare, many have not even been observed. Thus, it is difficult to model 
them. 


LO 3.b 


Extreme value theory (EVT) can be used to model extreme events in financial markets 
and to compute VaR, as well as expected shortfall. 


LO 3.c 

The peaks-over-threshold (POT) approach is an application of extreme value theory. It 
models the values that occur over a given threshold. It assumes that observations 
beyond the threshold follow a generalized Pareto distribution whose parameters can be 
estimated. 


LO 3.d 


The GEV and POT approach have the same goal and are built on the same general 
principles of extreme value theory. They even share the same shape parameter: ξ.


LO 3.e 

The parameters of a generalized Pareto (GP) distribution are the scale parameter β and 
the shape parameter ξ. Both of these can be estimated using the maximum likelihood 
technique.


When applying the generalized Pareto distribution, the researcher must choose a 
threshold. There is a tradeoff because the threshold must be high enough so that the GP 
distribution applies, but it must be low enough so that there are sufficient observations 
above the threshold to estimate the parameters. 


LO 3.f 

Multivariate EVT is important because many extreme values are dependent on each 
other, and elliptical distribution analysis and correlations are not useful in the 
modeling of extreme values for multivariate distributions. Modeling multivariate 
extremes requires the use of copulas. Given that more than one random variable is 
involved, modeling these extremes can be even more challenging because of the rarity 
of multiple extreme values occurring at the same time. 


ANSWER KEY FOR MODULE QUIZ 


Module Quiz 3.1 


1.D The Fisher-Tippett theorem says that as the sample size n gets large, the 
distribution of extremes, denoted M_n, converges to a generalized extreme value 
(GEV) distribution. (LO 3.b)


2.B The POT approach generally has fewer parameters, but both POT and GEV 
approaches share the tail parameter ξ. (LO 3.c)


3.A There is a tradeoff in setting the threshold. It must be high enough for the 
appropriate theorems to hold, but if set too high, there will not be enough 
observations to estimate the parameters. (LO 3.e) 


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Jorion, Chapter 6. 


READING 4 
BACKTESTING VaR 


Study Session 1 


EXAM FOCUS 


We use value at risk (VaR) methodologies to model risk. With VaR models, we seek to 
approximate the changes in value that our portfolio would experience in response to 
changes in the underlying risk factors. Model validation incorporates several methods 
that we use in order to determine how close our approximations are to actual changes 
in value. Through model validation, we are able to determine what confidence to place 
in our models, and we have the opportunity to improve their accuracy. For the exam, be 
prepared to validate approaches that measure how close VaR model approximations are 
to actual changes in value. Also, understand how the log-likelihood ratio (LR) is used to 
test the validity of VaR models for Type I and Type II errors for both unconditional and 
conditional tests. Finally, be familiar with Basel Committee outcomes that require 
banks to backtest their internal VaR models and penalize banks by enforcing higher 
capital requirements for excessive exceptions. 


MODULE 4.1: BACKTESTING VaR MODELS 


LO 4.a: Describe backtesting and exceptions and explain the importance of 
backtesting VaR models. 


Backtesting is the process of comparing losses predicted by a value at risk (VaR) 
model to those actually experienced over the testing period. It is an important tool for 
providing model validation, which is a process for determining whether a VaR model is 
adequate. The main goal of backtesting is to ensure that actual losses do not exceed 
the VaR estimate more often than the chosen confidence level implies. Observations that fall 
outside a given confidence level are called exceptions. The proportion of exceptions falling 
outside of the VaR confidence level should not exceed one minus the confidence level. 
For example, exceptions should occur no more than 5% of the time if the confidence level is 
95%.


Backtesting is extremely important for risk managers and regulators to validate 
whether VaR models are properly calibrated or accurate. If the level of exceptions is 
too high, models should be recalibrated and risk managers should re-evaluate 
assumptions, parameters, and/or modeling processes. The Basel Committee allows 
banks to use internal VaR models to measure their risk levels, and backtesting provides 
a critical evaluation technique to test the adequacy of those internal VaR models. Bank 
regulators rely on backtesting to verify risk models and identify banks that are 
designing models that underestimate their risk. Banks with excessive exceptions (more 
than four exceptions in a sample size of 250) are penalized with higher capital 
requirements. 


LO 4.b: Explain the significant difficulties in backtesting a VaR model. 


VaR models are based on static portfolios, while actual portfolio compositions are 
constantly changing as relative prices change and positions are bought and sold. 
Multiple risk factors affect actual profit and loss, but they are not included in the VaR 
model. For example, the actual returns are complicated by intraday changes as well as 
profit and loss factors that result from commissions, fees, interest income, and bid-ask 
spreads. Such effects can be minimized by backtesting with a relatively short time 
horizon such as a daily holding period. 


Another difficulty with backtesting is that the sample backtested may not be 
representative of the true underlying risk. The backtesting period constitutes a limited 
sample, so we do not expect to find the predicted number of exceptions in every 
sample. At some level, we must reject the model, which suggests the need to find an 
acceptable level of exceptions. 


Risk managers should track both actual and hypothetical returns that reflect VaR 
expectations. The VaR modeled returns are comparable to the hypothetical return that 
would be experienced had the portfolio remained constant for the holding period. 
Generally, we compare the VaR model returns to cleaned returns (i.e., actual returns 
adjusted for all changes that arise from changes that are not marked to market, like 
funding costs and fee income). Both actual and hypothetical returns should be 
backtested to verify the validity of the VaR model, and the VaR modeling methodology 
should be adjusted if hypothetical returns fail when backtesting. 


Using Failure Rates in Model Verification 
LO 4.c: Verify a model based on exceptions or failure rates. 


If a VaR model were completely accurate, we would expect VaR loss limits to be 
exceeded (this is called an exception) with the same frequency predicted by the 
confidence level used in the VaR model. For example, if we use a 95% confidence level, 
we expect to find exceptions in 5% of instances. Thus, backtesting is the process of 
systematically comparing actual (exceptions) and predicted loss levels. 


The backtesting period constitutes a limited sample at a specific confidence level. We 
would not expect to find the predicted number of exceptions in every sample. How, 
then, do we determine if the actual number of exceptions is acceptable? If we expect 
five exceptions and find eight, is that too many? What about nine? At some level, we 
must reject the model, and we need to know that level. 


Failure rates define the percentage of times the VaR confidence level is exceeded in a 
given sample. Under Basel rules, bank VaR models must use a 99% confidence level, 
which means a bank must report the VaR amount at the 1% left tail level for a total of T 
days. The total number of times exceptions occur is computed as N (the sum of the 
number of times actual returns exceeded the previous day’s VaR amount). 


An unbiased measure of the number of exceptions as a proportion of the number of 
samples is called the failure rate. The probability of exception, p, equals one minus the 
confidence level (p = 1 - c). If we use N to represent the number of exceptions and T to 
represent the sample size, the failure rate is computed as N / T. This failure rate is 
unbiased if the computed rate approaches p (i.e., one minus the confidence level) as the sample size 
increases. Non-parametric tests can then be used to see if the number of times a VaR 
model fails is acceptable or not. 


EXAMPLE: Computing the probability of exception 


Suppose a VaR of $10 million is calculated at a 95% confidence level. What is an 
acceptable probability of exception for exceeding this VaR amount? 


Answer: 


We expect to have exceptions (i.e., losses exceeding $10 million) 5% of the time (1 
- 95%). If exceptions are occurring with greater frequency, we may be 
underestimating the actual risk. If exceptions are occurring less frequently, we may 
be overestimating risk and misallocating capital as a result. 


Testing that the model is correctly calibrated requires the calculation of a z-score, 
where x is the number of actual exceptions observed, p is the probability of exception, 
and T is the sample size:

$$
z = \frac{x - pT}{\sqrt{p(1 - p)T}}
$$

This z-score is then compared to the critical value at the chosen level of confidence 
(e.g., 1.96 for the 95% confidence level) to determine whether the VaR model is unbiased.


EXAMPLE: Model verification


Suppose daily revenue fell below a predetermined VaR level (at the 95% 
confidence level) on 22 days during a 252-day period. Is this sample an unbiased 
sample? 


Answer: 


To answer this question, we calculate the z-score as follows:

$$
z = \frac{22 - 0.05(252)}{\sqrt{0.05(0.95)(252)}} = \frac{22 - 12.6}{3.46} = \frac{9.4}{3.46} = 2.72
$$

Based on the calculation, this is not an unbiased sample because the computed z-value 
of 2.72 is larger than the 1.96 critical value at the 95% confidence level. In 
this case, we would reject the null hypothesis that the VaR model is unbiased and 
conclude that the maximum number of exceptions has been exceeded.
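
The test above is easy to script. The sketch below assumes the usual normal approximation to the binomial, z = (x - pT) / sqrt(p(1 - p)T); the function name is hypothetical, and the inputs are those of the example.

```python
import math


def exception_z_score(exceptions, p, sample_size):
    """z-score for the observed number of exceptions under the null that p is correct."""
    expected = p * sample_size                         # expected exceptions, pT
    std_dev = math.sqrt(p * (1.0 - p) * sample_size)   # binomial standard deviation
    return (exceptions - expected) / std_dev


z = exception_z_score(exceptions=22, p=0.05, sample_size=252)
reject_model = abs(z) > 1.96  # two-tailed test at the 95% confidence level
print(f"z = {z:.2f}, reject model: {reject_model}")
```

The 1.96 cutoff belongs to the hypothesis test, not to the VaR model; a different test confidence level would use a different critical value with the same z-score.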


Note that the confidence level at which we choose to reject or fail to reject a model is 
not related to the confidence level at which VaR was calculated. In evaluating the 
accuracy of the model, we are comparing the number of exceptions observed with the 
maximum number of exceptions that would be expected from a correct model at a 
given confidence level. 


Type I and Type II Errors 


LO 4.d: Identify and describe Type I and Type II errors in the context of a 
backtesting process. 


A sample cannot be used to determine with absolute certainty whether the model is 
accurate. However, we can determine the accuracy of the model and the probability of 
having the number of exceptions that we experienced. When determining a range for 
the number of exceptions that we would accept, we must strike a balance between the 
chances of rejecting an accurate model (Type I error) and the chances of failing to 
reject an inaccurate model (Type II error). The model verification test involves a 
tradeoff between Type I and Type II errors. The goal in backtesting is to create a VaR 
model with a low Type I error and include a test for a very low Type II error rate. We 
can establish such ranges at different confidence levels using a binomial probability 
distribution based on the size of the sample. 


The binomial test is used to determine if the number of exceptions is acceptable at 
various confidence levels. Banks are required to use 250 days of data to be tested at the 
99% confidence level. With p = 0.01, this implies an expected 2.5 exceptions (0.01 × 250) in 
a 250-day time horizon. Bank regulators impose a penalty in the form of higher capital 
requirements if five or more exceptions are observed.


Figure 4.1 illustrates that we expect five or more exceptions 10.8% of the time given a 
99% confidence level. Regulators will reject a correct model or commit a Type I error 
in these cases at the far right tail. Figure 4.2 illustrates the far left tail of the 
distribution, where we evaluate Type II errors. If the model actually provides only 97% 
coverage (rather than 99%), fewer than five exceptions will still occur 12.8% of the time, 
and in those cases regulators will fail to reject the incorrect model.
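
Both error rates follow directly from the binomial distribution and can be reproduced as a check. This sketch assumes T = 250 days, a rejection cutoff of five or more exceptions, a correct model with p = 0.01, and an incorrect model whose true exception probability is 0.03; the helper name is hypothetical.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


T = 250
CUTOFF = 5  # the model is rejected when exceptions >= CUTOFF

# Type I error: a correct model (p = 0.01) produces five or more exceptions.
type_1_error = 1.0 - binomial_cdf(CUTOFF - 1, T, 0.01)

# Type II error: an incorrect model (true p = 0.03) produces fewer than five exceptions.
type_2_error = binomial_cdf(CUTOFF - 1, T, 0.03)

print(f"Type I: {type_1_error:.1%}, Type II: {type_2_error:.1%}")
```

Raising the cutoff lowers the Type I error rate but raises the Type II error rate, which is the tradeoff described above.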


Figure 4.1: Type I Error (Exceptions When Model Is Correct)

[Figure not reproduced. Axis label: Number of Exceptions.]


Figure 4.2: Type II Error (Exceptions When Model Is Incorrect)

[Figure not reproduced. Axis label: Number of Exceptions.]


Unconditional Coverage


Kupiec (1995)¹ determined a measure to accept or reject models using the tail points of 
a log-likelihood ratio (LR) as follows:

$$
LR_{uc} = -2\ln\left[(1 - p)^{T-N} p^{N}\right] + 2\ln\left\{\left[1 - (N/T)\right]^{T-N} (N/T)^{N}\right\}
$$

where p is the probability level, T is the sample size, N is the number of exceptions, and 
LR_uc is the test statistic for unconditional coverage (uc).


The term unconditional coverage refers to the fact that we are not concerned about 
independence of exception observations or the timing of when the exceptions occur. We 
simply are concerned with the number of total exceptions. We would reject the 
hypothesis that the model is correct if the LR_uc > 3.84. This critical LR value is used to 
determine the range of acceptable exceptions without rejecting the VaR model at the 
95% confidence level of the log-likelihood test. Figure 4.3 provides the nonrejection 
region for the number of failures (N) based on the probability level (p), confidence level 
(c), and time period (T).


Figure 4.3: Nonrejection Regions 


p        c        T = 252         T = 1,000
0.01     99.0%    N < 7           4 < N < 17
0.025    97.5%    2 < N < 12      15 < N < 36
0.05     95.0%    6 < N < 20      37 < N < 65


The LR_uc test could be used to backtest a daily holding period VaR model that was 
constructed using a 95% confidence level over a 252-day period. If the model is 
accurate, the expected number of exceptions will be 5% of 252, or 12.6. We know that 
even if the model is precise, there will be some variation in the number of exceptions 
between samples. The mean of the samples will approach 12.6 as the number of 
samples increases if the model is unbiased. However, we also know that even if the 
model is incorrect, we might still end up with the number of exceptions at or near 12.6. 


Figure 4.3 can be used to illustrate how increasing the sample size allows us to reject 
the model more easily. For example, at the 97.5% confidence level, where T = 252, the test 
interval is (2 / 252 = 0.0079, 12 / 252 = 0.0476). When T is increased to 1,000, the test 
interval shrinks to (15 / 1,000 = 0.015, 36 / 1,000 = 0.036).


Figure 4.3 also illustrates that it is difficult to backtest VaR models constructed with 
higher levels of confidence, because the number of exceptions is often not high enough 
to provide meaningful information. Notice that at the 95% confidence level, the test
interval for T = 252 is (6 / 252 = 0.024, 20 / 252 = 0.079). With higher confidence levels
(i.e., smaller values of p), the range of acceptable exceptions is smaller. Thus, it becomes
difficult to determine if the model is overstating risks (i.e., fewer than expected 
exceptions) or if the number of exceptions is simply at the lower range of acceptable. 
Banks will sometimes choose to use a higher value of p such as 5%, in order to validate 
the model with a sufficient number of deviations. 


Figure 4.4 shows the calculated values of LR_uc with 252-day samples for three different
VaR confidence levels and various exceptions per sample. To illustrate how Figure 4.4
was created, the test statistic for unconditional coverage in the first row (where N = 7,
T = 252, and p = 0.05) is computed as follows:

$$LR_{uc} = -2\ln[(1 - 0.05)^{252-7}(0.05)^{7}] + 2\ln\{[1 - (7/252)]^{252-7}(7/252)^{7}\} = 3.10$$
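The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not part of the curriculum; the function name `lr_uc` and its argument names are assumptions chosen for this sketch, and it works with logs of the likelihoods directly to avoid raising small numbers to large powers.

```python
import math

def lr_uc(n_exceptions: int, t_obs: int, p: float) -> float:
    """Unconditional coverage statistic; assumes 0 < n_exceptions < t_obs."""
    failure_rate = n_exceptions / t_obs
    # Log-likelihood if the true exception probability is p (the null hypothesis)
    log_null = (t_obs - n_exceptions) * math.log(1 - p) + n_exceptions * math.log(p)
    # Log-likelihood evaluated at the observed failure rate N / T
    log_observed = ((t_obs - n_exceptions) * math.log(1 - failure_rate)
                    + n_exceptions * math.log(failure_rate))
    return -2 * log_null + 2 * log_observed

# N = 7, T = 252, p = 0.05, as in the worked calculation above
print(f"{lr_uc(7, 252, 0.05):.2f}")  # 3.10
```

Because the statistic is below 3.84 for N = 7, the model would not be rejected at the 95% confidence level of the test.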


The tail points of the unconditional log-likelihood ratio use a chi-squared distribution
with one degree of freedom when T is large and the null hypothesis is that p is the true
probability or true failure rate. As mentioned, the chi-squared critical value is 3.84 at a
95% confidence level. Note that LR values greater than 3.84 in Figure 4.4 fall in the
rejection region.


Figure 4.4: LR_uc Values for T = 252


                                            N
c        1      2      3      4     5     6     7     8     9      10     15     20
95.0%    18.69  14.30  10.97  8.33  6.20  4.48  3.10  2.02  1.20   0.61   0.45   3.91
97.5%    7.03   4.09   2.19   0.99  0.30  0.01  0.08  0.43  1.05   1.90   8.94   19.59
99.0%    1.20   0.12   0.09   0.75  1.92  3.50  5.42  7.64  10.12  12.83  29.19  49.15


PROFESSOR’S NOTE 

With one degree of freedom, the chi-squared critical value is the square of the
two-tailed normal critical value. Recall that the two-tailed normal critical value at a
95% confidence level is 1.96, so squaring this value results in 3.84.




EXAMPLE: Testing for unconditional coverage 


Suppose that a risk manager needs to backtest a daily VaR model that was 
constructed using a 95% confidence level over a 252-day period. If the sample 
revealed 12 exceptions, should we reject or fail to reject the null hypothesis that p 
is the true probability of failure for this VaR model? 


Answer: 


We compute the test statistic as follows at the 95% confidence level (with T = 252,
p = 0.05, and N = 12):

$$LR_{uc} = -2\ln[(1 - 0.05)^{252-12}(0.05)^{12}] + 2\ln\{[1 - (12/252)]^{252-12}(12/252)^{12}\} = 0.03$$

LR_uc is less than the critical value of 3.84. Therefore, we fail to reject the null
hypothesis and the model is validated based on this sample test. We would expect
the number of exceptions to be 12.6 (N = 0.05 x 252 = 12.6).


Figure 4.4 illustrates that we would not reject the model at the 95% confidence level if 
the number of exceptions in our sample is greater than 6 and less than 20. For this 
example, if N was greater than or equal to 20, it would indicate that the VaR amount is 
too low and that the model understates the probability of large losses. If values of N are 
less than or equal to 6, it would indicate that the VaR model is too conservative. 
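The nonrejection region can also be recovered numerically by scanning candidate values of N and keeping those whose LR_uc falls below 3.84. The sketch below is illustrative only; the helper names are assumptions, and N = 0 is excluded from the scan because the log of a zero failure rate is undefined in this simple form.

```python
import math

def lr_uc(n_exceptions: int, t_obs: int, p: float) -> float:
    """Unconditional coverage statistic; assumes 0 < n_exceptions < t_obs."""
    rate = n_exceptions / t_obs
    log_null = (t_obs - n_exceptions) * math.log(1 - p) + n_exceptions * math.log(p)
    log_observed = ((t_obs - n_exceptions) * math.log(1 - rate)
                    + n_exceptions * math.log(rate))
    return -2 * log_null + 2 * log_observed

def nonrejection_range(t_obs: int, p: float, critical: float = 3.84):
    """Smallest and largest N (from 1 to T - 1) that would not be rejected."""
    accepted = [n for n in range(1, t_obs) if lr_uc(n, t_obs, p) < critical]
    return min(accepted), max(accepted)

print(nonrejection_range(252, 0.05))   # (7, 19), i.e., 6 < N < 20
print(nonrejection_range(252, 0.025))  # (3, 11), i.e., 2 < N < 12
```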


Using VaR to Measure Potential Losses 


Oftentimes, the purpose of using VaR is to measure some level of potential losses. There
are two theories about choosing a holding period for this application. The first theory is
that the holding period should correspond to the amount of time required to either
liquidate or hedge the portfolio. Thus, VaR would calculate possible losses before
corrective action could take effect. The second theory is that the holding period should
be chosen to match the period over which the portfolio is not expected to change due
to non-risk-related activity (e.g., trading). The two theories are not that different. For
example, many banks use a daily VaR to correspond with the daily profit and loss
measures. In this application, the holding period is more significant than the confidence
level.




MODULE QUIZ 4.1 

1. In backtesting a value at risk (VaR) model that was constructed using a 97.5% 
confidence level over a 252-day period, how many exceptions are forecasted? 
A. 2.5. 
B. 3.7. 
C. 6.3. 
D. 12.6. 


2. Unconditional testing does not reflect the: 
A. size of the portfolio. 
B. number of exceptions. 
C. confidence level chosen. 
D. timing of the exceptions. 


3. Which of the following statements regarding verification of a VaR model by examining 
its failure rates is false? 
A. The frequency of exceptions can be determined with the confidence level used for 
the model. 
B. According to Kupiec (1995), we should reject the hypothesis that the model is 
correct if the log-likelihood ratio (LR) > 3.84. 
C. Backtesting VaR models with a higher probability of exceptions is difficult because 
the number of exceptions is not high enough to provide meaningful information. 
D. The range for the number of exceptions must strike a balance between the chances 
of rejecting an accurate model (a Type I error) and the chances of failing to reject 
an inaccurate model (a Type II error). 


4. A risk manager is backtesting a sample at the 95% confidence level to see if a VaR 
model needs to be recalibrated. He is using 252 daily returns for the sample and 
discovered 17 exceptions. What is the z-score for this sample when conducting VaR 
model verification? 

A. 0.62. 
B. 1.27. 
C. 1.64. 
D. 2.86. 


MODULE 4.2: CONDITIONAL COVERAGE AND BASEL 
BACKTESTING RULES 


Conditional Coverage 


LO 4.e: Explain the need to consider conditional coverage in the backtesting 
framework. 


So far in the examples and discussion, we have been backtesting models based on
unconditional coverage, in which the timing of our exceptions was not considered.
Conditioning considers the time variation of the data. In addition to having a
predictable number of exceptions, we also anticipate the exceptions to be fairly equally
distributed across time. A bunching of exceptions may indicate that market
correlations have changed or that our trading positions have been altered. In the event
that exceptions are not independent, the risk manager should incorporate models that
consider time variation in risk.


We need some guide to determine if the bunching is random or caused by one of these
changes. By including a measure of the independence of exceptions, we can measure
conditional coverage of the model. Christoffersen² proposed extending the
unconditional coverage test statistic (LR_uc) to allow for potential time variation of the
data. He developed a statistic to determine the serial independence of deviations using
a log-likelihood ratio test (LR_ind). The overall log-likelihood test statistic for
conditional coverage (LR_cc) is then computed as:

$$LR_{cc} = LR_{uc} + LR_{ind}$$


Each individual component is independently distributed as chi-squared, and the sum is
also distributed as chi-squared. At the 95% confidence level, we would reject the model
if LR_cc > 5.99 (the chi-squared critical value with two degrees of freedom), and we would
reject the independence term alone if LR_ind > 3.84. If
exceptions are determined to be serially dependent, then the VaR model needs to be
revised to incorporate the correlations that are evident in the current conditions.
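As a rough illustration of the decision rule, the sketch below combines the two components and compares them with the critical values quoted above. The function name, the dictionary keys, and the sample inputs (LR_uc = 3.10, LR_ind = 4.00) are hypothetical values chosen for this sketch; computing LR_ind itself is outside its scope.

```python
def conditional_coverage_decision(lr_uc: float, lr_ind: float,
                                  cc_critical: float = 5.99,
                                  ind_critical: float = 3.84) -> dict:
    """Apply the 95% decision rules to the conditional coverage components."""
    lr_cc = lr_uc + lr_ind  # overall conditional coverage statistic
    return {
        "lr_cc": lr_cc,
        "reject_model": lr_cc > cc_critical,
        "reject_independence": lr_ind > ind_critical,
    }

# Hypothetical inputs: coverage alone looks acceptable, but exceptions cluster
result = conditional_coverage_decision(lr_uc=3.10, lr_ind=4.00)
print(result["reject_model"], result["reject_independence"])  # True True
```

In this hypothetical case the unconditional statistic alone (3.10) would not reject the model, but the clustering of exceptions pushes the combined statistic above 5.99.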


PROFESSOR’S NOTE

For the exam, you do not need to know how to calculate the log-likelihood
test statistic for conditional coverage. Therefore, the focus here is to
understand that the test for conditional coverage should be performed when
exceptions are clustered together.


Basel Committee Rules for Backtesting 
LO 4.f: Describe the Basel rules for backtesting. 


In the backtesting process, we attempt to strike a balance between the probability of a 
Type I error (rejecting a model that is correct) and a Type II error (failing to reject a 
model that is incorrect). Thus, the Basel Committee is primarily concerned with 
identifying whether exceptions are the result of bad luck (Type I error) or a faulty 
model (Type II error). The Basel Committee requires that market VaR be calculated at 
the 99% confidence level and backtested over the past year. At the 99% confidence 
level, we would expect to have 2.5 exceptions (250 x 0.01) each year, given 
approximately 250 trading days. 


Regulators do not have access to every parameter input of the model and must
construct rules that are applicable across institutions. To mitigate the risk that banks
willingly commit a Type II error and use a faulty model, the Basel Committee designed
the Basel penalty zones presented in Figure 4.5. The committee established a scale of
the number of exceptions and corresponding increases in the capital multiplier, k. Thus,
banks are penalized for exceeding four exceptions per year. The multiplier is normally
three but can be increased to as much as four, based on the accuracy of the bank’s VaR
model. Increasing k significantly increases the amount of capital a bank must hold and
lowers the bank’s performance measures, like return on equity.


Notice in Figure 4.5 that there are three zones. The green zone is an acceptable number 
of exceptions. The yellow zone indicates a penalty zone where the capital multiplier is 
increased by 0.40 to 0.85. The red zone, where 10 or more exceptions are observed, 
indicates the strictest penalty with an increase of 1 to the capital multiplier. 


Figure 4.5: Basel Penalty Zones 


Zone      Number of Exceptions    Multiplier (k)
Green     0 to 4                  3.00
Yellow    5                       3.40
          6                       3.50
          7                       3.65
          8                       3.75
          9                       3.85
Red       10 or more              4.00
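Figure 4.5 amounts to a simple lookup from the number of exceptions to a zone and multiplier. The sketch below encodes the table as given; the names `YELLOW_ZONE` and `basel_zone` are assumptions for illustration, and the sketch ignores the supervisory discretion that applies in the yellow zone.

```python
# Yellow-zone multipliers from Figure 4.5, keyed by number of exceptions
YELLOW_ZONE = {5: 3.40, 6: 3.50, 7: 3.65, 8: 3.75, 9: 3.85}

def basel_zone(exceptions: int) -> tuple:
    """Return (zone, multiplier k) for a one-year count of exceptions."""
    if exceptions < 0:
        raise ValueError("number of exceptions cannot be negative")
    if exceptions <= 4:
        return ("green", 3.00)
    if exceptions <= 9:
        return ("yellow", YELLOW_ZONE[exceptions])
    return ("red", 4.00)

print(basel_zone(3))   # ('green', 3.0)
print(basel_zone(7))   # ('yellow', 3.65)
print(basel_zone(12))  # ('red', 4.0)
```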


As shown in Figure 4.5, the yellow zone is quite broad (five to nine exceptions). The
penalty (raising the multiplier from three to four) is automatically required for banks
with 10 or more exceptions. However, the penalty for banks with five to nine exceptions
is subject to supervisors’ discretion, based on what type of model error caused the
exceptions. The Committee established four categories of causes for exceptions and
guidance for supervisors for each category:


■ The basic integrity of the model is lacking. Exceptions occurred because of incorrect
data or errors in the model programming. The penalty should apply.

■ Model accuracy needs improvement. The exceptions occurred because the model does
not accurately describe risks. The penalty should apply.

■ Intraday trading activity. The exceptions occurred due to trading activity (VaR is
based on static portfolios). The penalty should be considered.

■ Bad luck. The exceptions occurred because market conditions (volatility and
correlations among financial instruments) significantly varied from an accepted
norm. These exceptions should be expected to occur at least some of the time. No
penalty guidance is provided.


Although the yellow zone is broad, an accurate model could produce five or more
exceptions 10.8% of the time at the 99% confidence level. So even if a bank has an
accurate model, it is subject to punishment 10.8% of the time (using the required 99%
confidence level). However, regulators are more concerned about Type II errors, and
the increased capital multiplier penalty is enforced using the 97% confidence level. At
this level, inaccurate models would not be rejected 12.8% of the time (e.g., those with
VaR calculated at the 97% confidence level rather than the required 99% confidence
level). While this seems to be only a slight difference, using a 99% confidence level
would result in a 1.24 times greater level of required capital, providing a powerful
economic incentive for banks to use a lower confidence level. Exceptions may be
excluded if they are the result of bad luck that follows from an unexpected change in
interest rates, exchange rates, political event, or natural disaster. Bank regulators keep
the description of exceptions intentionally vague to allow adjustments during major
market disruptions.
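The 10.8% and 12.8% figures can be reproduced with a binomial model, assuming 250 independent trading days and a cutoff of five or more exceptions. This is a sketch under those assumptions; the helper name `binom_cdf` is introduced here for illustration.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

T = 250  # approximate number of trading days in a year

# Type I error: an accurate 99% model (p = 0.01) shows five or more exceptions
type_1 = 1 - binom_cdf(4, T, 0.01)

# Type II error: a model with true 97% coverage (p = 0.03) shows four or fewer
type_2 = binom_cdf(4, T, 0.03)

print(f"{type_1:.1%}")  # 10.8%
print(f"{type_2:.1%}")  # 12.8%
```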


Industry analysts have suggested lowering the required VaR confidence level to 95%
and compensating by using a greater multiplier. This would result in a greater number
of expected exceptions, and variances would be more statistically significant. The
expected number of exceptions over one year at the 95% level would be about 13
(250 x 0.05 = 12.5), and with a cutoff of more than 17 exceptions,
the probability of a Type I error would be 12.5% (close to the 10.8% previously noted),
but the probability of a Type II error at this level would fall to 7.4% (compared to
12.8% at a 97% confidence level). Thus, inaccurate models would escape rejection
less frequently.


Another way to make variations in the number of exceptions more significant would be 
to use a longer backtesting period. This approach may not be as practical because the 
nature of markets, portfolios, and risk changes over time. 


MODULE QUIZ 4.2


1. The Basel Committee has established four categories of causes for exceptions. Which 
of the following does not apply to one of those categories? 
A. The sample is small. 
B. Intraday trading activity. 
C. Model accuracy needs improvement. 
D. The basic integrity of the model is lacking. 


KEY CONCEPTS 


LO 4.a 


Backtesting is an important part of VaR model validation. It involves comparing the 
number of instances where the actual loss exceeds the VaR level (called exceptions) 
with the number predicted by the model at the chosen level of confidence. The Basel 
Committee requires banks to backtest internal VaR models and penalizes banks with 
excessive exceptions in the form of higher capital requirements. 


LO 4.b 


VaR models are based on static portfolios, while actual portfolio compositions are 
dynamic and incorporate fees, commissions, and other profit and loss factors. This 
effect is minimized by backtesting with a relatively short time horizon such as daily 
holding periods. The backtesting period constitutes a limited sample, and a challenge 
for risk managers is to find an acceptable level of exceptions. 


LO 4.c 


The failure rate of a model backtest is the number of exceptions divided by the number 
of observations: N / T. The Basel Committee requires backtesting at the 99% 
confidence level over the past year (250 business days). At this level, we would expect 
250 x 0.01, or 2.5 exceptions. 


LO 4.d 


In using backtesting to accept or reject a VaR model, we must balance the probabilities 
of two types of errors: a Type I error is rejecting an accurate model, and a Type II error 
is failing to reject an inaccurate model. A log-likelihood ratio is used as a test for the 
validity of VaR models. 


LO 4.e 


Unconditional coverage testing does not evaluate the timing of exceptions, while
conditional coverage tests review the number and timing of exceptions for
independence. Current market or trading portfolio conditions may require changes to
the VaR model.


LO 4.f


The Basel Committee penalizes financial institutions when the number of exceptions
exceeds four. The corresponding penalties incrementally increase the capital
requirement multiplier for the financial institution from three to four as the number of
exceptions increases.


ANSWER KEY FOR MODULE QUIZZES 


Module Quiz 4.1 


1.C (1 - 0.975) x 252 = 6.3 
(LO 4.c) 


2.D Unconditional testing does not capture the timing of exceptions. 
(LO 4.d) 


3.C Backtesting VaR models with a lower probability of exceptions is difficult because 
the number of exceptions is not high enough to provide meaningful information. 
(LO 4.d) 
4.B The z-score is calculated using x = 17, p = 0.05, c = 0.95, and N = 252, as follows:

$$z = \frac{17 - 0.05(252)}{\sqrt{0.05(0.95)(252)}} = \frac{17 - 12.6}{\sqrt{11.97}} = \frac{4.4}{3.4598} = 1.27$$

(LO 4.c)


Module Quiz 4.2 


1.A Causes include the following: bad luck, intraday trading activity, model accuracy 
needs improvement, and the basic integrity of the model is lacking. (LO 4.f) 


1. Paul Kupiec, “Techniques for Verifying the Accuracy of Risk Measurement Models,” Journal of Derivatives, 2 
(December 1995): 73-84. 


2. P. F. Christoffersen, “Evaluating Interval Forecasts,” International Economic Review, 39 (1998), 841-862.
READING 5
VAR MAPPING 


Study Session 1 


EXAM FOCUS 


This reading introduces the concept of mapping a portfolio and shows how the risk of a 
complex, multi-asset portfolio can be separated into risk factors. For the exam, be able 
to explain the mapping process for several types of portfolios, including fixed-income 
portfolios and portfolios consisting of linear and nonlinear derivatives. Also, be able to 
describe how the mapping process simplifies risk management for large portfolios. 
Finally, be able to distinguish between general and specific risk factors, and understand 
the various inputs required for calculating undiversified and diversified value at risk 
(VaR). 


MODULE 5.1: VaR MAPPING 


LO 5.a: Explain the principles underlying VaR mapping and describe the mapping 
process. 


Value at risk (VaR) mapping involves replacing the current values of a portfolio with 
risk factor exposures. The first step in the process is to measure all current positions 
within a portfolio. These positions are then mapped to risk factors by means of factor 
exposures. Mapping involves finding common risk factors among positions in a given 
portfolio. If we have a portfolio consisting of a large number of positions, it may be 
difficult and time consuming to manage the risk of each individual position. Instead, we 
can evaluate the value of these positions by mapping them onto common risk factors 
(e.g., changes in interest rates or equity prices). By reducing the number of variables 
under consideration, we greatly simplify the risk management process. 


Mapping can assist a risk manager in evaluating positions whose characteristics may 
change over time, such as fixed-income securities. Mapping can also provide an 
effective way to manage risk when there is not sufficient historical data for an 
investment, such as an initial public offering (IPO). In both cases, evaluating historical 
prices may not be relevant, so the manager must evaluate those risk factors that are 
likely to impact the portfolio’s risk profile. 


The principles for VaR risk mapping are summarized as follows: 


■ VaR mapping aggregates risk exposure when it is impractical to consider each
position separately. For example, there may be too many computations needed to
measure the risk for each individual position.

■ VaR mapping simplifies risk exposures into primitive risk factors. For example, a
portfolio may have thousands of positions linked to a specific exchange rate that
could be summarized with one aggregate risk factor.

■ VaR risk measurements can differ from pricing methods where prices cannot be
aggregated. The aggregation of a number of positions to one risk factor is acceptable
for risk measurement purposes.

■ VaR mapping is useful for measuring changes over time, as with bonds or options. For
example, as bonds mature, risk exposure can be mapped to spot yields that reflect the
current position.

■ VaR mapping is useful when historical data is not available.


The first step in the VaR mapping process is to identify common risk factors for 
different investment positions. Figure 5.1 illustrates how the market values (MVs) of 
each position or investment are matched to the common risk factors identified by a 
risk manager. 


Figure 5.1: Mapping Positions to Risk Factors

[Diagram: position market values are mapped to Risk Factors 1 through 3, followed by risk aggregation.]


Figure 5.2 illustrates the next step, where the risk manager constructs risk factor
distributions and inputs all data into the risk model. In this case, the market value of
the first position, MV_1, is allocated to the risk exposures in the first row, x_11, x_12, and
x_13. The other market value positions are linked to the risk exposures in a similar way.
Summing the risk exposures in each column then creates a vector consisting of three risk
exposures.


Figure 5.2: Mapping Risk Exposures 


Investment    Market Value    Risk Factor 1    Risk Factor 2    Risk Factor 3
1             MV_1            x_11             x_12             x_13
2             MV_2            x_21             x_22             x_23
3             MV_3            x_31             x_32             x_33
4             MV_4            x_41             x_42             x_43
5             MV_5            x_51             x_52             x_53
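The column-summing step can be illustrated with a small sketch. The exposure numbers below are hypothetical and serve only to show the mechanics for three positions and three risk factors; the function name is likewise an assumption.

```python
# Hypothetical exposures: row i is an investment, column j is a risk factor
exposures = [
    [40.0, 10.0, 0.0],
    [0.0, 25.0, 5.0],
    [15.0, 0.0, 20.0],
]

def aggregate_exposures(matrix):
    """Sum each column to get one aggregate exposure per risk factor."""
    return [sum(column) for column in zip(*matrix)]

factor_exposures = aggregate_exposures(exposures)
print(factor_exposures)  # [55.0, 35.0, 25.0]
```

The risk model then works with the three aggregate exposures rather than with each position separately.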


LO 5.b: Explain and demonstrate how the mapping process captures general and 
specific risks. 


So how many general risk factors (or primitive risk factors) are appropriate for a 
given portfolio? In some cases, one or two risk factors may be sufficient. Of course, the 
more risk factors chosen, the more time consuming the modeling of a portfolio 
becomes. However, more risk factors could lead to a better approximation of the 
portfolio’s risk exposure. 


In our choice of general risk factors for use in VaR models, we should be aware that the 
types and number of risk factors we choose will have an effect on the size of residual or 
specific risks. Specific risks arise from unsystematic risk or asset-specific risks of 
various positions in the portfolio. The more precisely we define risk, the smaller the 
specific risk. For example, a portfolio of bonds may include bonds of different ratings, 
terms, and currencies. If we use duration as our only risk factor, there will be a 
significant amount of variance among the bonds that we refer to as specific risk. If
we add a risk factor for credit risk, we could expect that the amount of specific risk 
would be smaller. If we add another risk factor for currencies, we would expect that the 
specific risk would be even smaller. Thus, the definition of specific risk is a function of 
general market risk. 


As an example, suppose an equity portfolio consists of 5,000 stocks. Each stock has a 
market risk component and a firm-specific component. If each stock has a 
corresponding risk factor, we would need roughly 12.5 million covariance terms (i.e., 
[5,000 x (5,000 - 1)] / 2) to evaluate the correlation between each risk factor. To 
simplify the number of parameters required, we need to understand that diversification 
will reduce firm-specific components and leave only market risk (i.e., systematic risk 
or beta risk). We can then map the market risk component of each stock onto a stock 
index (i.e., changes in equity prices) to greatly reduce the number of parameters needed. 
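The parameter savings can be checked with a short calculation. The second function is an assumption of this sketch: it counts one beta and one specific variance per stock plus one market variance, which is how a single-index mapping is commonly parameterized.

```python
def pairwise_covariance_terms(n_assets: int) -> int:
    """Number of distinct covariance pairs among n risk factors: n(n - 1) / 2."""
    return n_assets * (n_assets - 1) // 2

def single_index_parameters(n_assets: int) -> int:
    """Assumed count for a single-index mapping: n betas + n specific variances + 1."""
    return 2 * n_assets + 1

print(pairwise_covariance_terms(5000))  # 12497500
print(single_index_parameters(5000))    # 10001
```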


Suppose you have a portfolio of N stocks and map each stock to the market index,
which is defined as your primitive risk factor. The risk exposure, β_i, is computed by
regressing the return of stock i on the market index return using the following equation:

$$R_i = \alpha_i + \beta_i R_M + \varepsilon_i$$


We can ignore the first term (i.e., the intercept) as it does not relate to risk, and we will
also assume that the last term, which is related to specific risk, is not correlated with
other stocks or the market portfolio. If the weight of each position in the portfolio is
defined as w_i, then the portfolio return is defined as follows:

$$R_p = \sum_{i=1}^{N} w_i R_i = \sum_{i=1}^{N} w_i \beta_i R_M + \sum_{i=1}^{N} w_i \varepsilon_i$$

Aggregating all risk exposures, β_i, based on the market weights of each position
determines the risk exposure as follows:

$$\beta_p = \sum_{i=1}^{N} w_i \beta_i$$

We can then decompose the variance, V, of the portfolio return into two components,
which consist of general market risk exposures and specific risk exposures, as follows:

$$V(R_p) = \beta_p^2 \times V(R_M) + \sum_{i=1}^{N} w_i^2 \times \sigma_{\varepsilon,i}^2$$

General market risk: $\beta_p^2 \times V(R_M)$
Specific risk: $\sum_{i=1}^{N} w_i^2 \times \sigma_{\varepsilon,i}^2$
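The decomposition can be sketched numerically as follows. The two-stock portfolio, its betas, and the variance inputs are hypothetical values chosen for illustration, and the function name is an assumption of this sketch.

```python
def decompose_variance(weights, betas, specific_vars, market_var):
    """Split portfolio variance into general market risk and specific risk."""
    # Portfolio exposure: weighted sum of the individual betas
    beta_p = sum(w * b for w, b in zip(weights, betas))
    general_risk = beta_p**2 * market_var
    # Specific risk: residuals are assumed uncorrelated, so no cross terms
    specific_risk = sum(w**2 * s for w, s in zip(weights, specific_vars))
    return beta_p, general_risk, specific_risk

# Hypothetical inputs: two equally weighted stocks
beta_p, general, specific = decompose_variance(
    weights=[0.5, 0.5],
    betas=[0.8, 1.2],
    specific_vars=[0.02, 0.03],
    market_var=0.04,
)
print(round(beta_p, 4), round(general, 4), round(specific, 4))  # 1.0 0.04 0.0125
```

Adding more positions with small weights shrinks the specific risk term, which is why a well-diversified portfolio is left mostly with general market risk.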
MODULE QUIZ 5.1

1. Which of the following could be considered a general risk factor?


I. Exchange rates. 
II. Zero-coupon bonds. 


A. I only. 

B. II only. 

C. Both I and II. 
D. Neither I nor II. 


MODULE 5.2: MAPPING FIXED-INCOME SECURITIES 


LO 5.c: Differentiate among the three methods for mapping portfolios of fixed- 
income securities. 


After we have selected our general risk factors, we must map our portfolio onto these
factors. The three methods of mapping for fixed-income securities are (1) principal
mapping, (2) duration mapping, and (3) cash flow mapping.

■ Principal mapping. This method includes only the risk of repayment of principal
amounts. For principal mapping, we consider the average maturity of the portfolio.
VaR is calculated using the risk level from the zero-coupon bond that equals the
average maturity of the portfolio. This method is the simplest of the three
approaches.

■ Duration mapping. With this method, the risk of the bond is mapped to a zero-coupon
bond of the same duration. For duration mapping, we calculate VaR by using
the risk level of the zero-coupon bond that equals the duration of the portfolio. Note
that it may be difficult to calculate the risk level that exactly matches the duration of
the portfolio.

■ Cash flow mapping. With this method, the risk of the bond is decomposed into the
risk of each of the bond’s cash flows. Cash flow mapping is the most precise method
because we map the present value of the cash flows (i.e., face amount discounted at
the spot rate for a given maturity) onto the risk factors for zeros of the same
maturities and include the inter-maturity correlations.


LO 5.d: Summarize how to map a fixed-income portfolio into positions of 
standard instruments. 


To illustrate principal, duration, and cash flow mapping, we will use a two-position
fixed-income portfolio consisting of a one-year bond and a five-year bond. You will 
notice in the following examples that the primary difference between these mapping 
techniques is the consideration of the timing and amount of cash flows. 


Suppose a portfolio consists of two par value bonds. One bond is a one-year $100 
million bond with a coupon rate of 3.5%. The second bond is a five-year $100 million 
bond with a coupon rate of 5%. In this example, we will differentiate between the 
timing and cash flows used to map the VaR for this portfolio using principal mapping, 
duration mapping, and cash flow mapping. The risk percentages (or VaR percentages) 
for zero-coupon bonds with maturities ranging from one to five years (at the 95% 
confidence level) are as follows: 


Maturity    VaR %
1           0.4696
2           0.9868
3           1.4841
4           1.9714
5           2.4261


Principal mapping is the simplest of the three techniques as it only considers the 
timing of the redemption or maturity payments of the bonds. While this simplifies the 
process, it ignores all coupon payments for the bonds. The weights in this example are 
both 50% (i.e., $100 million / $200 million). Thus, the weighted average life of this
portfolio for the two bonds is three years [0.50(1) + 0.50(5) = 3]. 


As Figure 5.3 illustrates, the principal mapping technique assumes that the total
portfolio value of $200 million occurs at the average life of the portfolio, which is three
years. Note that the VaR percentage at the 95% confidence level is 1.4841% for a
three-year zero-coupon bond. We compute the VaR under the principal method by
multiplying this VaR percentage by the market value of the portfolio,
as follows:


Principal mapping VaR = $200 million x 1.4841% = $2.968 million 
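A minimal sketch of the principal mapping calculation follows. The names `VAR_PCT` and `principal_mapping_var` are assumptions for illustration, and the sketch only handles average maturities that appear in the VaR table (no interpolation).

```python
# Zero-coupon VaR percentages at the 95% confidence level, by maturity in years
VAR_PCT = {1: 0.4696, 2: 0.9868, 3: 1.4841, 4: 1.9714, 5: 2.4261}

def principal_mapping_var(principals, maturities):
    """VaR (same units as principals) using the weighted average maturity."""
    total = sum(principals)
    avg_maturity = sum(p * m for p, m in zip(principals, maturities)) / total
    if avg_maturity not in VAR_PCT:
        raise ValueError("average maturity is not in the VaR table")
    return total * VAR_PCT[avg_maturity] / 100

# Two $100 million bonds maturing in one year and five years
var_millions = principal_mapping_var([100, 100], [1, 5])
print(f"{var_millions:.3f}")  # 2.968
```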


Figure 5.3: Fixed-Income Mapping Techniques 


         CFs for    CFs for                  Mapping Technique
         5-Year     1-Year     Spot
Year     Bond       Bond       Rates     Principal    Duration    PV(CF)
1        $5         $103.5     3.50%                              $104.83
2        $5         $0         3.90%                              $4.63
2.768                                                 $200
3        $5         $0         4.19%     $200                     $4.42
4        $5         $0         4.21%                              $4.24
5        $105       $0         5.10%                              $81.88
Total                                    $200         $200        $200.00


In the last three columns of Figure 5.3, you can see the differences in the amounts and 
timing of cash flows for all three methods. To calculate the VaR of this fixed-income 
portfolio using duration mapping, we simply replace the portfolio with a zero-coupon 
bond that has the same maturity as the duration of the portfolio. Figure 5.4 
demonstrates the calculation of Macaulay duration for this portfolio. The numerator of 
the duration calculation is the sum of time, t, multiplied by the present value of cash 
flows, and the denominator is simply the present value of all cash flows. Duration is 
then computed as $553.69 million / $200 million = 2.768. 


Figure 5.4: Duration Calculation 


CF for CF for 
5-Year 1-Year 
Year Bond Bond Spot Rate PV(CF) t x PV(CF) 

1 $5 $103.5 3.50% $104.83 $104.83 
2 $5 $0 3.90% $4.63 $9.26 
3 $5 $0 4.19% $4.42 $13.26 
4 $5 $0 4.21% $4.24 $16.96 
5 $105 $0 5.10% $81.88 $409.38 


$200.00 $553.69 


The next step is to interpolate the VaR for a zero-coupon bond with a maturity of 2.768 
years. Recall that the VaR percentages for two-year and three-year zero-coupon bonds 
were 0.9868 and 1.4841, respectively. 
The VaR of a 2.768 year maturity zero-coupon bond is interpolated as follows: 
0.9868 + (1.4841 — 0.9868) x (2.768 — 2) = 0.9868 + (0.4973 x 0.768) = 1.3687 
We now have the information needed to calculate the VaR for this portfolio using the 
interpolated VaR percentage for a zero-coupon bond with a 2.768 year maturity: 
Duration mapping VaR = $200 million x 1.3687% = $2.737 million 
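The duration mapping steps (present values, Macaulay duration, interpolation, VaR) can be sketched as follows. Variable names are assumptions for illustration. Because the sketch carries unrounded present values, its results can differ from the figures in the text in the third decimal place.

```python
# Inputs from the two-bond example (cash flows in $ millions)
VAR_PCT = {1: 0.4696, 2: 0.9868, 3: 1.4841, 4: 1.9714, 5: 2.4261}
spot = {1: 0.035, 2: 0.039, 3: 0.0419, 4: 0.0421, 5: 0.051}
cash_flows = {1: 108.5, 2: 5.0, 3: 5.0, 4: 5.0, 5: 105.0}

# Present value of each year's combined cash flow at that year's spot rate
pv = {t: cf / (1 + spot[t]) ** t for t, cf in cash_flows.items()}
total_pv = sum(pv.values())

# Macaulay duration: PV-weighted average time to the cash flows
duration = sum(t * pv[t] for t in pv) / total_pv

# Linear interpolation between the VaR percentages of the bracketing maturities
lower = int(duration)
upper = lower + 1
var_pct = VAR_PCT[lower] + (VAR_PCT[upper] - VAR_PCT[lower]) * (duration - lower)

duration_var = total_pv * var_pct / 100
print(f"duration = {duration:.2f} years, VaR = ${duration_var:.2f} million")
```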


In order to calculate the VaR for this fixed-income portfolio using cash flow mapping,
we need to map the present value of the cash flows (i.e., face amount discounted at the
spot rate for a given maturity) onto the risk factors for zeros of the same maturities
and include the inter-maturity correlations. Figure 5.5 summarizes the required
calculations. The second column of Figure 5.5 provides the present value of cash flows
that were computed in Figure 5.3. The third column of Figure 5.5 multiplies the present
value of cash flows times the zero-coupon VaR percentages.


Figure 5.5: Cash Flow Mapping 


                    Correlation Matrix (R)
Year    x           2-Year    3-Year    4-Year
1       104.83      0.894     0.887     0.871
2       4.63        1         0.99      0.964
3       4.42        0.99      1         0.992
4       4.24        0.964     0.992     1
5       81.88       0.954     0.987     0.996

Undiversified VaR    2.674
Diversified VaR      2.615


If the five zero-coupon bonds were all perfectly correlated, then the undiversified VaR 
could be calculated as follows: 


$$\text{Undiversified VaR} = \sum_{i=1}^{N} |x_i| \times V_i$$


In this example, the undiversified VaR is computed as the sum of the third column: 
2.674. 
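As a quick check, the undiversified VaR can be recomputed from the present values in Figure 5.3 and the zero-coupon VaR percentages. The variable names below are assumptions for illustration.

```python
# Present values of cash flows ($ millions) and VaR percentages by maturity
pv_cash_flows = [104.83, 4.63, 4.42, 4.24, 81.88]
var_pct = [0.4696, 0.9868, 1.4841, 1.9714, 2.4261]

# x_i * V_i for each maturity, with V_i converted from percent to a decimal
individual_vars = [abs(x) * v / 100 for x, v in zip(pv_cash_flows, var_pct)]

undiversified_var = sum(individual_vars)
print(f"{undiversified_var:.3f}")  # 2.674
```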


The correlation matrix provided in the fourth through eighth columns of Figure 5.5 
provides the inter-maturity correlations for the zero-coupon bonds for all five 
maturities. The diversified VaR can be computed using matrix algebra as follows: 


$$\text{Diversified VaR} = \alpha\sqrt{x'\Sigma x} = \sqrt{(x \times V)'R(x \times V)}$$


Where x is the present value of cash flows vector, V is the vector of VaR for zero-coupon 
bond returns and R is the correlation matrix. The last column of Figure 5.5 summarizes 
the computations for the matrix algebra. The square root of the sum of this column 
(6.840) is the diversified VaR using cash flow mapping and is calculated as 2.615. 


Notice that in order to calculate portfolio diversified VaR using the cash flow mapping 
method, we need to incorporate the correlations between the zero-coupon bonds. As 
you can see, cash flow mapping is the most precise method, but it is also the most 
complex. 
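The matrix calculation can be sketched in plain Python. The function name is an assumption, and the two correlation matrices used below are hypothetical limiting cases (perfect correlation and zero correlation) rather than the matrix from Figure 5.5; they show that perfect correlation reproduces the undiversified VaR.

```python
import math

def diversified_var(pv_cash_flows, var_pct, corr):
    """sqrt((x * V)' R (x * V)) for a correlation matrix given as nested lists."""
    xv = [x * v / 100 for x, v in zip(pv_cash_flows, var_pct)]
    n = len(xv)
    quad_form = sum(xv[i] * corr[i][j] * xv[j] for i in range(n) for j in range(n))
    return math.sqrt(quad_form)

pv_cash_flows = [104.83, 4.63, 4.42, 4.24, 81.88]
var_pct = [0.4696, 0.9868, 1.4841, 1.9714, 2.4261]

# Hypothetical limiting cases for the correlation matrix
perfect = [[1.0] * 5 for _ in range(5)]
independent = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]

print(f"{diversified_var(pv_cash_flows, var_pct, perfect):.3f}")  # 2.674
print(diversified_var(pv_cash_flows, var_pct, independent) < 2.674)  # True
```

With correlations below one, the result falls below the undiversified figure, which is the diversification benefit that cash flow mapping captures.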


PROFESSOR’S NOTE

The complex calculations required for cash flow mapping would be very time
consuming to perform using a financial calculator. Therefore, this calculation
is highly unlikely to show up on the exam.


MODULE QUIZ 5.2

1. Which of the following methods is not one of the three approaches for mapping a
portfolio of fixed-income securities onto risk factors? 
A. Principal mapping. 
B. Duration mapping. 
C. Cash flow mapping. 
D. Present value mapping. 


2. If portfolio assets are perfectly correlated, portfolio VaR will equal: 


A. marginal VaR. 

B. component VaR. 
C. undiversified VaR. 
D. diversified VaR. 


3. The VaR percentages at the 95% confidence level for a bond with maturities ranging 
from one year to five years are as follows: 


Maturity VaR % 
1 0.4696 
2 0.9868 
3 1.4841 
4 1.9714 
5 2.4261 


A bond portfolio consists of a $100 million bond maturing in two years and a $100 
million bond maturing in four years. What is the VaR of this bond portfolio using the 
principal VaR mapping method? 

A. $1.484 million. 

B. $1.974 million. 

C. $2.769 million. 

D. $2.968 million. 


MODULE 5.3: STRESS TESTING, PERFORMANCE 
BENCHMARKS, AND MAPPING DERIVATIVES 


Stress Testing 


LO 5.e: Describe how mapping of risk factors can support stress testing. 


If we assume that there is perfect correlation among maturities of the zeros, the 
portfolio VaR would be equal to the undiversified VaR (i.e., the sum of the VaRs, as 
illustrated in the third column of Figure 5.5). Instead of calculating the undiversified 
VaR directly, we could reduce each zero-coupon value by its respective VaR and then 
revalue the portfolio. The difference between the revalued portfolio and the original 
portfolio value should be equal to the undiversified VaR. Stressing each zero by its VaR 
is a simpler approach than incorporating correlations; however, this method ceases to 
be viable if correlations are anything but perfect (i.e., 1).


Using the same two-bond portfolio from the previous example, we can stress test the
VaR measurement, assuming all zeros are perfectly correlated, and derive movements
in the value of zero-coupon bonds. Figure 5.6 illustrates the calculations required to
stress test the portfolio. The present value factor for a one-year zero-coupon bond
discounted at 3.5% is simply 1 / (1.035) = 0.9662. The VaR percentage movement at the
95% confidence level for a one-year zero-coupon bond is provided in the fifth column
(0.4696). Thus, there is a 95% probability that the present value factor of a one-year
zero-coupon bond will not fall below 0.9616 [computed as follows:
0.9662 x (1 - 0.4696 / 100) = 0.9616].


The VaR adjusted present values of zero-coupon bonds are presented in the seventh 
column of Figure 5.6. The last column simply finds the present value of the portfolio’s 
cash flows using the VaR% adjusted present value factors. The sum of these values 
suggests that the change in portfolio value is $2.67 (computed as $200.00 - $197.33).
Notice that the $2.67 is equivalent to the undiversified VaR previously computed in 
Figure 5.5. 


Figure 5.6: Stress Testing a Portfolio 


        Portfolio   Spot                           PV        VaR Adj.     New Zero
Year    CF          Rate     PV(CF)     VaR%       Factor    PV Factor    Value
1       $108.5      3.50%    $104.83    0.4696     0.9662    0.9616       $104.34
2       $5          3.90%    $4.63      0.9868     0.9263    0.9172       $4.59
3       $5          4.19%    $4.42      1.4841     0.8841    0.8710       $4.36
4       $5          4.21%    $4.24      1.9714     0.8479    0.8312       $4.16
5       $105        5.10%    $81.88     2.4261     0.7798    0.7609       $79.89
Total                        $200.00                                      $197.33
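The stress test in Figure 5.6 can be reproduced with a few lines of Python. This is a minimal sketch that uses the cash flows, spot rates, and VaR percentages from the figure; the variable names are illustrative only. It also checks that the stress loss matches the undiversified VaR (the sum of each present value times its VaR percentage).

```python
# Inputs copied from Figure 5.6: cash flows, spot rates, and 95% VaR percentages.
cash_flows = [108.5, 5.0, 5.0, 5.0, 105.0]
spot_rates = [0.0350, 0.0390, 0.0419, 0.0421, 0.0510]
var_pcts = [0.4696, 0.9868, 1.4841, 1.9714, 2.4261]  # stated in percent

# Present value factor for each year, then the same factor shocked down by its VaR%.
pv_factors = [1.0 / (1.0 + r) ** t for t, r in enumerate(spot_rates, start=1)]
stressed_factors = [f * (1.0 - v / 100.0) for f, v in zip(pv_factors, var_pcts)]

original_value = sum(cf * f for cf, f in zip(cash_flows, pv_factors))
stressed_value = sum(cf * f for cf, f in zip(cash_flows, stressed_factors))
stress_loss = original_value - stressed_value

# Undiversified VaR: sum of PV(CF) x VaR% across all maturities.
undiversified_var = sum(
    cf * f * v / 100.0 for cf, f, v in zip(cash_flows, pv_factors, var_pcts)
)

print(round(original_value, 2), round(stressed_value, 2), round(stress_loss, 2))
```

Because every zero is shocked by its full VaR at the same time, the revaluation implicitly assumes a correlation of 1 across maturities, which is why the two numbers agree.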


Benchmarking a Portfolio 


LO 5.f: Explain how VaR can be computed and used relative to a performance 
benchmark. 


It is often convenient to measure VaR relative to a benchmark portfolio. This is what is 
referred to as benchmarking a portfolio. Portfolios can be constructed that match the 
risk factors of a benchmark portfolio but have either a higher or a lower VaR. The VaR 
of the deviation between the two portfolios is referred to as a tracking error VaR. In
other words, tracking error VaR measures the risk of the difference between the positions
of the target portfolio and those of the benchmark portfolio, which is not the same as
the difference between the two portfolios' VaRs.


Suppose you are trying to benchmark the VaR of a $100 million bond portfolio with a 
duration of 4.77 to a portfolio of two zero-coupon bonds with the same duration at the 
95% confidence level. The market value weights of the bonds in the benchmark 
portfolio and portfolios of two zero-coupon bonds are provided in Figure 5.7. 


Figure 5.7: Benchmark Portfolio and Zero-Coupon Bond Portfolio Weights 


Maturity Benchmark A B C D E 


Total Value 100.00 100.00 100.00 100.00 100.00 100.00 


The first step in the benchmarking process is to match the duration with two zero- 
coupon bonds. Therefore, the weights of the market values of the zero-coupon bonds in 
Figure 5.7 are adjusted to match the benchmark portfolio duration of 4.77. Figure 5.8 
illustrates the creation of five two-bond portfolios with a duration of 4.77. The market 
values of all bonds in the zero-coupon portfolios are adjusted to match the duration of 
the benchmark portfolio. For example, Portfolio A in Figure 5.7 and Figure 5.8 consists 
of a four-year zero-coupon bond with a market weight of 23% and a five-year zero- 
coupon bond with a market weight of 77%. This results in a duration for Portfolio A of 
4.77, which is equivalent to the benchmark. The other zero-coupon bond portfolios also 
adjust their weights of the two zero-coupon bonds to match the benchmark’s duration. 
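The Portfolio A weights can be verified with a short calculation. The sketch below assumes that the duration of a zero-coupon bond equals its maturity, so the portfolio duration is the weighted average of the two maturities; the function name is hypothetical.

```python
def short_bond_weight(target_duration, short_maturity, long_maturity):
    """Weight on the shorter zero so that the two-bond portfolio hits the target duration.

    Assumes each zero's duration equals its maturity.
    """
    return (long_maturity - target_duration) / (long_maturity - short_maturity)


# Portfolio A: four-year and five-year zeros matched to the benchmark duration of 4.77.
weight_4yr = short_bond_weight(4.77, 4.0, 5.0)
weight_5yr = 1.0 - weight_4yr
portfolio_duration = weight_4yr * 4.0 + weight_5yr * 5.0

print(round(weight_4yr, 2), round(weight_5yr, 2), round(portfolio_duration, 2))
```

The same function applies to any pair of maturities that brackets the target duration.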


Figure 5.8: Matching Duration of Zero-Coupon Bond Portfolios to Benchmark 


Time Benchmark A B C D E
1 month
3 month
6 month
1 year
2 year
3 year
4 year
5 year
7 year
9 year
10 year
15 year
20 year
30 year
Duration 4.77 4.77 4.77 4.77 4.77 4.77


Figure 5.9 presents the absolute VaR, which is computed by multiplying the market value
weights of the bonds (presented in Figure 5.7) by the VaR percentages presented in the
second column of Figure 5.9. The VaR percentages are for a monthly time horizon. The
absolute VaR for the benchmark portfolio is computed as $1.99 million. Notice this is
very close to the VaR percentage for the four-year note in Figure 5.9.


Next, the absolute VaR for each of the five portfolios of two zero-coupon bonds is
computed by multiplying the VaR percentages by the market values of the zero-coupon
bonds. We define the vector of market value positions for each zero-coupon bond
portfolio presented in Figure 5.7 as $x$ and the vector of market value positions of the
benchmark as $x_0$. Then the relative performance to the benchmark is computed as the
tracking error VaR as follows:


$\text{Tracking error VaR} = \alpha\sqrt{(x - x_0)'\,\Sigma\,(x - x_0)}$


The tracking error between the benchmark and the zero-coupon bond portfolios is due to
nonparallel shifts in the term structure of interest rates. However, the tracking error
VaR of $0.45 million between zero-coupon bond Portfolio A and the benchmark is much
less than the benchmark VaR of $1.99 million. In this example, the
smallest tracking error is for Portfolio C. Notice that the benchmark portfolio has the 
largest market weight in the two-year note. Thus, the cash flows are most closely 
aligned with Portfolio C, which contains a two-year zero-coupon bond. This reduces the 
tracking error to $0.17 million for that portfolio. Also notice that minimizing the 
absolute VaR in Figure 5.9 is not the same as minimizing the tracking error. Portfolio E 
is a barbell portfolio with the highest tracking error to the index, even though it has the 
lowest absolute VaR. 


Tracking error can be used to compute the variance reduction (similar to R-squared in
a regression) as follows:


$\text{Variance improvement} = 1 - (\text{tracking error VaR} / \text{benchmark VaR})^2$

Variance improvement for Portfolio C relative to the benchmark is computed as:
$1 - (0.17 / 1.99)^2 = 99.3\%$
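As a quick check, the variance improvement can be computed for all five zero-coupon portfolios. The sketch below uses the benchmark VaR and tracking error VaR values reported in Figure 5.9 (in $ millions); the names are illustrative.

```python
# Benchmark VaR and tracking error VaRs as reported in Figure 5.9 ($ millions).
benchmark_var = 1.99
tracking_error_var = {"A": 0.45, "B": 0.31, "C": 0.17, "D": 0.21, "E": 0.84}


def variance_improvement(te_var, bench_var):
    """Share of benchmark variance explained by the replicating portfolio."""
    return 1.0 - (te_var / bench_var) ** 2


improvements = {
    name: variance_improvement(te, benchmark_var)
    for name, te in tracking_error_var.items()
}

for name, value in improvements.items():
    print(f"Portfolio {name}: {value:.1%}")
```

Portfolio C shows the largest improvement and the barbell Portfolio E the smallest, in line with the ranking of tracking errors.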

Figure 5.9: Absolute VaR and Tracking Error Relative to Benchmark Portfolio 


Time                  VaR%      Benchmark   A       B       C       D       E
1 month               0.022
3 month               0.065
6 month               0.163
1 year                0.47
2 year                0.987
3 year                1.484
4 year                1.971
5 year                2.426
7 year                3.192
9 year                3.913
10 year               4.25
15 year               6.234
20 year               8.146
30 year               11.119
Absolute VaR                    1.99        2.32    2.24    2.14    2.05    1.76
Tracking Error VaR              0.00        0.45    0.31    0.17    0.21    0.84


LO 5.g: Describe the method of mapping forwards, forward rate agreements, 
interest rate swaps, and options. 


Mapping Approaches for Linear Derivatives 


Forward Contracts 


The delta-normal method provides accurate estimates of VaR for portfolios and assets 
that can be expressed as linear combinations of normally distributed risk factors. Once 
a portfolio, or financial instrument, is expressed as a linear combination of risk factors, 
a covariance (correlation) matrix can be generated, and VaR can be measured using 
matrix multiplication. 


Forwards are appropriate for the application of the delta-normal method. Their values 
are a linear combination of a few general risk factors, which have commonly available 
volatility and correlation data. 


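The current value of a forward contract is equal to the present value of the difference
between the current forward rate, F, and the locked-in delivery rate, K, as follows:


$\text{Forward}_t = (F_t - K)e^{-r_t \tau}$

where $r_t$ is the discount rate and $\tau$ is the time remaining until delivery.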


Suppose you wish to compute the diversified VaR of a forward contract that is used to
purchase euros with U.S. dollars one year from now. This forward position is analogous
to the following three separate risk positions:

1. A short position in a U.S. Treasury bill.
2. A long position in a one-year euro bill.
3. A long position in the euro spot market.


Figure 5.10 presents the pricing information for the purchase of 100 million euros in
exchange for $126.5 million, as well as the correlation matrix between the positions.


Figure 5.10: Monthly VaR for Forward Contract and Correlation Matrix 


Risk Factor       Price/Rate   VaR%      EUR Spot   1Yr EUR   1Yr US
EUR spot          1.2500       4.5381     1.000      0.115     0.073
Long EUR bill     0.0170       0.1396     0.115      1.000    -0.047
Short USD bill    0.0292       0.2121     0.073     -0.047     1.000
EUR forward       1.2650


In this example, we have a long position in a EUR contract worth $122.911 million
today and a short position in a one-year U.S. T-bill worth $122.911 million today, as
illustrated in Figure 5.11. The fourth column represents the investment present values.
The fifth column represents the absolute present value of cash flows multiplied by the
VaR percentage from Figure 5.10.


Figure 5.11: Undiversified and Diversified VaR of Forward Contract 


Position            PV Factor   CF       x           |x_i| V_i   x ΔVaR
EUR spot                                 122.911     5.578       31.116
Long EUR bill       0.9777      100.0    122.911     0.172       0.142
Short USD bill      0.9678      126.5    -122.911    0.261       -0.036
Undiversified VaR                                    6.010       31.221
Diversified VaR                                                  5.588


*Note that some rounding has occurred. 


The undiversified VaR for this position is $6.01 million, and the diversified VaR for this 
position is $5.588 million. Recall that the diversified VaR is computed using matrix 
algebra. 
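The matrix algebra behind these two figures can be sketched in plain Python. The inputs come from Figures 5.10 and 5.11; the sign vector (+1 for long, -1 for short) is an assumption made here to encode the direction of each position, and the variable names are illustrative.

```python
import math

# Inputs from Figures 5.10 and 5.11 ($ millions and percent).
present_value = 122.911
var_pcts = [4.5381, 0.1396, 0.2121]  # EUR spot, EUR bill, USD bill
signs = [1, 1, -1]                   # long spot, long EUR bill, short USD bill (assumed encoding)
corr = [
    [1.000, 0.115, 0.073],
    [0.115, 1.000, -0.047],
    [0.073, -0.047, 1.000],
]

# Signed position VaRs: x_i * V_i.
position_vars = [s * present_value * p / 100.0 for s, p in zip(signs, var_pcts)]

# Undiversified VaR sums absolute values; diversified VaR is sqrt(v' R v).
undiversified_var = sum(abs(v) for v in position_vars)
variance = sum(
    position_vars[i] * corr[i][j] * position_vars[j]
    for i in range(3)
    for j in range(3)
)
diversified_var = math.sqrt(variance)

print(round(undiversified_var, 3), round(diversified_var, 3))
```

The result is dominated by the EUR spot exposure, so the diversification benefit here comes mostly from the small interest rate positions partially offsetting each other.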


The general procedure we've outlined for forwards also applies to other types of
financial instruments, such as forward rate agreements and interest rate swaps. As long
as an instrument can be expressed as linear combinations of its basic components, the
delta-normal VaR may be applied with reasonable accuracy.


Forward Rate Agreements (FRAs) 

Suppose you have an FRA that locks in an interest rate one year from now. Figure 5.12
illustrates data related to selling a 6 x 12 FRA on $100 million. This position is
equivalent to borrowing $100 million for a 6-month period (180 days) and investing
the proceeds at the 12-month rate (360 days). Assuming that the 360-day spot rate is
4.5% and the 180-day spot rate is 4.1%, the present values of the cash flows are
presented in the second column of Figure 5.12. The present value of the notional $100
million contract is x = $100 / 1.0205 = $97.991 million. This will be invested for a 12-
month period. The forward rate is then computed from
$(1 + F_{1,2} / 2) = 1.045 / (1 + 0.041 / 2)$, so that
$F_{1,2} = [(1.045 / 1.0205) - 1] \times 2 = 4.8\%$.


The sixth column computes the undiversified VaR of $0.62 million at the 95% 
confidence level using the VaR percentages in the third column multiplied by the 
absolute value of the present values of cash flows. Matrix algebra is then used to 
multiply this vector by the correlation matrix presented in columns four and five to 
compute the diversified VaR of $0.348 million. 


Figure 5.12: Calculating VaR for an FRA 


Position            PV(CF), x   VaR%      Correlations (R)   |x_i| V_i   x ΔVaR
180 days            -97.991     0.1629    1       0.79       0.160       -0.0325
360 days            97.991      0.4696    0.79    1          0.460       0.1537
Undiversified VaR                                            0.620
                                                                         0.1212
Diversified VaR                                                          0.348
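The FRA figures can be reproduced with the two-position version of the diversified VaR formula. This is a sketch based on the inputs in the text and Figure 5.12; treating the 180-day leg as the short (borrowing) position is how the sale of the FRA is read here, and the variable names are illustrative.

```python
import math

notional = 100.0                    # $ millions
rate_180, rate_360 = 0.041, 0.045   # spot rates from the text
var_pct_180, var_pct_360 = 0.1629, 0.4696
rho = 0.79                          # correlation between the two rates

# Present value of the notional and the implied 6 x 12 forward rate.
present_value = notional / (1.0 + rate_180 / 2.0)
forward_rate = ((1.0 + rate_360) / (1.0 + rate_180 / 2.0) - 1.0) * 2.0

# Signed position VaRs: short the 180-day leg, long the 360-day leg.
v_short = -present_value * var_pct_180 / 100.0
v_long = present_value * var_pct_360 / 100.0

undiversified_var = abs(v_short) + abs(v_long)
diversified_var = math.sqrt(v_short**2 + v_long**2 + 2.0 * rho * v_short * v_long)

print(round(present_value, 3), round(forward_rate, 3))
print(round(undiversified_var, 3), round(diversified_var, 3))
```

Because the two legs have opposite signs and a high positive correlation, the diversified VaR is well below the undiversified figure.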


Interest Rate Swaps 


Interest rate swaps are commonly used to exchange interest rates from fixed to floating 
rates or from floating to fixed rates. Thus, an interest rate swap can be broken down 
into fixed and floating parts. The fixed part is priced with a coupon-paying bond and the 
floating part is priced as a floating-rate note. 


Suppose you want to compute the VaR of a $100 million four-year swap that pays a 
fixed rate for four years in exchange for a floating-rate payment. The necessary steps to 
compute the undiversified and diversified VaR amounts are as follows: 


Step 1: Begin by laying out the present values of the cash flows. The fixed portion is a
short position in a fixed-rate bond, because we agree to pay the fixed interest
payments and the principal at maturity. Then, add the long position in the
floating-rate note, which has a present value of $100 million today.


Step 2: Multiply the vector representing the absolute present values of cash flows by
the VaR percentages at the 95% confidence level and sum the values to compute
the undiversified VaR amount.


Step 3: Use matrix algebra to combine the vector of position VaRs with the correlation
matrix to compute the diversified VaR amount.


Mapping Approaches for Nonlinear Derivatives 


As mentioned, the delta-normal VaR method is based on linear relationships between
variables. Options, however, exhibit nonlinear relationships between movements of the
values of the underlying instruments and the values of the options. In many cases, the
delta-normal method may still be applied because the value of an option may be
expressed linearly as the product of the option delta and the underlying asset.


Unfortunately, the delta-normal VaR cannot be expected to provide an accurate 
estimate of the true VaR over ranges where deltas are unstable. In other words, over 
longer periods of time, the delta is not a constant, which makes linear methods 
inappropriate. Conversely, over short periods of time, such as one day, a linear 
approximation of the delta is more accurate. However, the accuracy of this 
approximation is dependent on parameter inputs (i.e., delta increases with the 
underlying spot price). 


For example, assume the strike price of an option is $100 with a volatility of 25%. If we 
are only concerned about a one-day risk horizon, then the one-day loss could be 
computed as follows: 


$-\alpha S_0 \sigma \sqrt{T} = -1.645 \times \$100 \times 0.25 \times \sqrt{1/252} = -\$2.59$


Thus, over a one-day horizon, the worst-case scenario at the 95% confidence level is a 
loss of $2.59, which brings the position down to $97.41. Linear approximations using 
this method may be reliable for longer maturity options if the risk horizon is very 
short, such as a one-day time horizon. 
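The one-day calculation above can be checked numerically. The sketch below follows the formula in treating $100 as the underlying price and assumes 252 trading days per year; the variable names are illustrative.

```python
import math

alpha = 1.645        # one-tailed 95% normal deviate
spot_price = 100.0   # underlying value used in the formula
annual_vol = 0.25
trading_days = 252   # assumed trading days per year

# Worst-case one-day move at the 95% confidence level.
worst_move = alpha * spot_price * annual_vol * math.sqrt(1.0 / trading_days)
stressed_price = spot_price - worst_move

print(round(worst_move, 2), round(stressed_price, 2))
```

Scaling the annual volatility by the square root of time is what keeps the move small enough for a linear (delta) approximation to remain reasonable over one day.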


PROFESSOR’S NOTE
Options are usually mapped using a Taylor series approximation and using the 
delta-gamma method to calculate the option VaR. 


MODULE QUIZ 5.3


1. Suppose you are calculating the tracking error VaR for two zero-coupon bonds using a 
$100 million benchmark bond portfolio with the following maturities and market 
value weights. Which of the following combinations of two zero-coupon bonds would 
most likely have the smallest tracking error? 


Maturity Benchmark 

1 month 1.00 
1 year 10.00 
2 year 13.00 
3 year 24.00 
4 year 12.00 
5 year 18.00 
7 year 9.25 
10 year 6.50 
20 year 4.75 
30 year 1.50 


A. 1 year and 7 year. 
B. 2 year and 4 year. 
C. 3 year and 5 year. 


D. 4 year and 7 year. 


KEY CONCEPTS 


LO 5.a 


Value at risk (VaR) mapping involves replacing the current values of a portfolio with 
risk factor exposures. Portfolio exposures are broken down into general risk factors and 
mapped onto those factors. 


LO 5.b 
Specific risk decreases as more risk factors are added to a VaR model. 


LO 5.c 


Fixed-income risk mapping methods include principal mapping, duration mapping, and 
cash flow mapping. Principal mapping considers only the principal cash flow at the 
average life of the portfolio. Duration mapping considers the market value of the 
portfolio at its duration. Cash flow mapping is the most complex method considering 
the timing and correlations of all cash flows. 


LO 5.d 
The primary difference between principal, duration, and cash flow mapping techniques 
is the consideration of the timing and amount of cash flows. 


Undiversified VaR is calculated as:
$\text{Undiversified VaR} = \sum_{i=1}^{N} |x_i| \times V_i$


Diversified VaR is computed using matrix algebra as follows:
$\text{Diversified VaR} = \alpha\sqrt{x'\Sigma x} = \sqrt{(x \times V)'\,R\,(x \times V)}$


LO 5.e 


Stress testing each zero-coupon bond by its VaR is a simpler approach than 
incorporating correlations; however, this method ceases to be viable if correlations are 
anything other than 1. 


LO 5.f 


A popular use of VaR is to establish a benchmark portfolio and measure VaR of other 
portfolios in relation to this benchmark. The tracking error VaR is smallest for 
portfolios most closely matched based on cash flows. 


LO 5.g 


Delta-normal VaR can be applied to portfolios of many types of instruments as long as
the risk factors are linearly related. Application of the delta-normal method with
options and other derivatives does not provide accurate VaR measures over long risk
horizons in which deltas are unstable.


ANSWER KEY FOR MODULE QUIZZES 
Module Quiz 5.1 


1.A Exchange rates can be used as general risk factors. Zero-coupon bonds are used to 
map bond positions but are not considered a risk factor. However, the interest 
rate on those zeros is a risk factor. (LO 5.b) 


Module Quiz 5.2 


1.D Present value mapping is not one of the approaches. (LO 5.c) 


2.C If we assume perfect correlation among assets, VaR would be equal to 
undiversified VaR. (LO 5.d) 


3.D The VaR percentage is 1.4841 for a three-year zero-coupon bond [(2 + 4) / 2 = 3].
We compute the VaR under the principal method by multiplying the VaR
percentage at the portfolio's average life by the total market value of the
portfolio: principal mapping VaR = $200 million x 1.4841% = $2.968 million. (LO 5.d)


Module Quiz 5.3 


1.C The three-year and five-year cash flows are highest for the benchmark portfolio at 
$24 million and $18 million, respectively. Thus, tracking error VaR will likely be 
the lowest for the portfolio where the cash flows of the benchmark and zero- 
coupon bond portfolios are most closely matched. (LO 5.f) 
Study Session 1


EXAM FOCUS 


This reading addresses tools for risk measurement, including value at risk (VaR) and 
expected shortfall. Specifically, we will examine VaR implementation over different 
time horizons and VaR adjustments for liquidity costs. This reading also examines 
academic studies related to integrated risk management and discusses the importance 
of measuring interactions among risks due to risk diversification. Note that several 
concepts in this reading, such as liquidity risk, stressed VaR, and capital requirements, 
will be discussed in more detail in Books 3 and 4. 


MODULE 6.1: RISK MEASUREMENT FOR THE 
TRADING BOOK 


Value at Risk (VaR) Implementation 


LO 6.a: Explain the following lessons on VaR implementation: time horizon over 
which VaR is estimated, the recognition of time-varying volatility in VaR risk 
factors, and VaR backtesting. 


There is no consensus regarding the proper time horizon for risk measurement. The 
appropriate time horizon depends on the risk measurement purpose (e.g., setting
capital limits) as well as portfolio liquidity. Thus, there is not a universally accepted 
approach for aggregating various VaR measures based on different time horizons. 


Time-varying volatility results from volatility fluctuations over time. The effect of
time-varying volatility on the accuracy of VaR measures decreases as time horizon
increases. However, volatility generated by stochastic (i.e., random) jumps will reduce
the accuracy of long-term VaR measures unless there is an adjustment made for
stochastic jumps. It is important to recognize time-varying volatility in VaR measures
since ignoring it will likely lead to an underestimation of risk. In addition to volatility
fluctuations, risk managers should also account for time-varying correlations when
making VaR calculations.


To simplify VaR estimation, the financial industry has a tendency to use short time 
horizons. This approach is computationally attractive for larger portfolios. However, a 
10-day VaR time horizon, as suggested by the Basel Committee on Banking Supervision, 
is not always optimal. It is preferable to allow the risk horizon to vary
based on specific investment characteristics. When computing VaR over longer time 
horizons, a risk manager needs to account for the variation in a portfolio’s composition 
over time. Thus, a longer than 10-day time horizon may be necessary for economic 
capital purposes. 


Historically, VaR backtesting has been used to validate VaR models. However, 
backtesting is not effective when the number of VaR exceptions is small. In addition, 
backtesting is less effective over longer time horizons due to portfolio instability. VaR 
models tend to be more realistic if time-varying volatility is incorporated; however, 
this approach tends to generate a procyclical VaR measure and produces unstable risk 
models due to estimation issues. 


Integrating Liquidity Risk Into VaR Models 


LO 6.b: Describe exogenous and endogenous liquidity risk and explain how they 
might be integrated into VaR models. 


During times of a financial crisis, market liquidity conditions change, which changes 
the liquidity horizon of an investment (i.e., the time to unwind a position without 
materially affecting its price). Two types of liquidity risk are exogenous liquidity and 
endogenous liquidity. Both types of liquidity are important to measure; however, 
academic studies suggest that risk valuation models should first account for the impact 
of endogenous liquidity. 


Exogenous liquidity represents market-specific, average transaction costs and is
handled through the calculation of a liquidity-adjusted VaR (LVaR) measure. The LVaR
measure incorporates a bid/ask spread by adding liquidity costs to the initial estimate
of VaR.
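As an illustration of adding a liquidity cost to VaR, the sketch below uses one common convention in which the liquidity cost equals half the bid/ask spread times the position value. That formula, the function name, and all input figures are assumptions for illustration and are not taken from this reading.

```python
def liquidity_adjusted_var(var, position_value, bid_ask_spread):
    """LVaR under an assumed constant-spread convention.

    Liquidity cost = 0.5 x spread x position value (half the spread is paid
    when unwinding the position at the bid rather than the mid price).
    """
    liquidity_cost = 0.5 * bid_ask_spread * position_value
    return var + liquidity_cost


# Hypothetical inputs: $10 million position, $200,000 VaR, 0.4% bid/ask spread.
lvar = liquidity_adjusted_var(
    var=200_000.0, position_value=10_000_000.0, bid_ask_spread=0.004
)
print(round(lvar, 2))
```

Note that this adjustment does not depend on the size of the trade relative to the market, which is what distinguishes exogenous from endogenous liquidity.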


Endogenous liquidity is an adjustment for the price effect of liquidating positions. It 
depends on trade sizes and is applicable when market orders are large enough to move 
prices. Endogenous liquidity is the elasticity of prices to trading volumes and is more 
easily observed in instances of high liquidity risk. 


Poor market conditions can cause a “flight to quality,” which decreases a trader’s ability
to unwind positions in thinly traded assets. Thus, endogenous liquidity risk is most
applicable to exotic/complex trading positions and very relevant in high-stress market
conditions; however, endogenous liquidity costs will be present in all market
conditions.


Risk Measures 
LO 6.c: Compare VaR, expected shortfall, and other relevant risk measures. 


VaR estimates the maximum loss that can occur given a specified level of confidence. 
VaR is a useful measure of risk since it is easy to compute and readily applicable. 
However, it does not consider losses beyond the VaR confidence level (i.e., the threshold 
level). In other words, VaR does not consider the severity of losses in the tail of the 
returns distribution. An additional disadvantage of VaR is that it is not subadditive, 
meaning that the VaR of a combined portfolio can be greater than the sum of the VaRs 
of each asset within the portfolio. 


An alternative risk measure, frequently used by financial institutions, is expected 
shortfall. Expected shortfall is more complex and computationally intensive than VaR;
however, it does correct for some of the drawbacks of VaR. Namely, it is able to account
for the magnitude of losses beyond the VaR threshold and it is always subadditive. In 
addition, the application of expected shortfall will mitigate the impact that a specific 
confidence level choice will have on risk management decisions. 
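The difference between the two measures can be seen with a small historical-simulation style example. The sketch below uses made-up loss samples and one simple empirical convention (VaR is the largest loss outside the worst 5% of observations, and expected shortfall is the average of that worst 5%); other quantile conventions exist, and the function name is hypothetical.

```python
def var_and_es(losses, confidence=0.95):
    """Empirical VaR and expected shortfall under a simple tail-count convention."""
    ordered = sorted(losses, reverse=True)
    tail_count = round(len(ordered) * (1.0 - confidence))
    if tail_count == 0:
        raise ValueError("sample too small for this confidence level")
    var = ordered[tail_count]                   # largest loss outside the tail
    es = sum(ordered[:tail_count]) / tail_count  # average loss inside the tail
    return var, es


# Two hypothetical 100-observation loss samples that differ only in one tail loss.
body = [0.0] * 94 + [5.0]
mild_tail = body + [6.0] * 5
fat_tail = body + [6.0] * 4 + [50.0]

mild_var, mild_es = var_and_es(mild_tail)
fat_var, fat_es = var_and_es(fat_tail)
print(mild_var, mild_es)
print(fat_var, fat_es)
```

Both samples produce the same VaR, but the expected shortfall rises sharply for the sample with the extreme loss, which is the tail severity that VaR ignores.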


Spectral risk measures generalize expected shortfall and consider an investment 
manager’s aversion to risk. These measures have select advantages over expected 
shortfall by including better smoothness properties when weighting observations as 
well as the ability to modify a risk measure to reflect an investor’s specific risk 
aversion. Aside from the special case of expected shortfall, other spectral risk measures 
are rarely used in practice. 


Stress Testing 


It is important to incorporate stress testing into risk models by selecting various stress 
scenarios. Three primary applications of stress testing exercises are as follows: 


1. Historical scenarios, which examine previous market data. 


2. Predefined scenarios, which attempt to assess the impact on profit/loss of adverse 
changes in a predetermined set of risk factors. 


3. Mechanical-search stress tests, which use automated routines to cover possible 
changes in risk factors. 


In stress testing, it is important to “stress” the correlation matrix. However, an 
unreasonable assumption related to stress testing is that market shocks occur instantly 
and that traders cannot re-hedge or adjust their positions. 


When VaR is computed and analyzed, it is generally under more normalized market
conditions, so it may not be accurate in a more stressful environment. A stressed VaR
approach, which attempts to account for a period of significant financial stress, has
not been thoroughly tested or analyzed. Thus, VaR could lead to inaccurate risk
assessment under market stresses.


Integrated Risk Measurement 
LO 6.d: Compare unified and compartmentalized risk measurement. 


Unified and compartmentalized risk measurement methods aggregate risks for banks. A 
compartmentalized approach sums risks separately, whereas a unified, or integrated, 
approach considers the interaction among risks. 


A unified approach considers all risk categories simultaneously. This approach can 
capture possible compounding effects that are not considered when looking at 
individual risk measures in isolation. For example, unified approaches may consider 
market, credit, and operational risks all together. 


When calculating capital requirements, banks use a compartmentalized approach, 
whereby capital requirements are calculated for individual risk types, such as market 
risk and credit risk. These stand-alone capital requirements are then summed in order 
to obtain the bank’s overall level of capital. 


The Basel regulatory framework uses a “building block” approach, whereby a bank’s 
regulatory capital requirement is the sum of the capital requirements for various risk 
categories. Pillar 1 risk categories include market, credit, and operational risks. Pillar 2 
risk categories incorporate concentration risks, stress tests, and other risks, such as 
liquidity, residual, and business risks. 


Thus, the overall Basel approach to calculating capital requirements is a non-integrated 
approach to risk measurement. In contrast, an integrated approach would look at 
capital requirements for each of the risks simultaneously and account for potential risk 
correlations and interactions. Note that simply calculating individual risks and adding 
them together will not necessarily produce an accurate measure of true risk. 


Risk Aggregation 


LO 6.e: Compare the results of research on top-down and bottom-up risk 
aggregation methods. 


A bank’s assets can be viewed as a series of subportfolios consisting of market, credit, 
and operational risk. However, these risk categories are intertwined and at times 
difficult to separate. For example, foreign currency loans will contain both foreign 
exchange risk and credit risk. Thus, interactions among various risk factors should be 
considered. 


The top-down approach to risk aggregation assumes that a bank’s portfolio can be 
cleanly subdivided according to market, credit, and operational risk measures. In 
contrast, a bottom-up approach attempts to account for interactions among various 
risk factors. 


In order to assess which approach is more appropriate, academic studies calculate the
ratio of unified capital to compartmentalized capital (i.e., the ratio of integrated risks to
separate risks). Top-down studies calculate this ratio to be less than one, which suggests
that risk diversification is present and ignored by the separate approach. Bottom-up
studies also often calculate this ratio to be less than one; however, this research has not
been conclusive, and has recently found evidence of risk compounding, which produces
a ratio greater than one. Thus, bottom-up studies suggest that risk diversification
should be questioned.
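To see why a ratio below one indicates diversification, consider a stylized example. The sketch below assumes that unified capital is obtained by combining stand-alone market and credit capital with a single correlation, which is a simplification chosen for illustration and not a method described in this reading; the capital figures are hypothetical.

```python
import math


def capital_ratio(market_capital, credit_capital, rho):
    """Ratio of unified to compartmentalized capital under an assumed
    correlation-based aggregation of two stand-alone capital figures."""
    unified = math.sqrt(
        market_capital**2
        + credit_capital**2
        + 2.0 * rho * market_capital * credit_capital
    )
    compartmentalized = market_capital + credit_capital
    return unified / compartmentalized


# Hypothetical stand-alone capital: 60 for market risk, 80 for credit risk.
ratio_partial = capital_ratio(60.0, 80.0, rho=0.5)
ratio_perfect = capital_ratio(60.0, 80.0, rho=1.0)
print(round(ratio_partial, 3), round(ratio_perfect, 3))
```

Under this aggregation the ratio can never exceed one, so it cannot capture risk compounding; a ratio above one only appears when the interaction between risks is modeled directly, as bottom-up studies attempt to do.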


It is conservative to evaluate market risk and credit risk independently. However, most 
academic studies confirm that market risk and credit risk should be looked at jointly. If 
a bank ignores risk interdependencies, a bank’s capital requirement will be measured 
improperly due to the presence of risk diversification. Therefore, separate 
measurement of market risk and credit risk most likely provides an upper bound on the 
integrated capital level. 


Note that if a bank is unable to completely separate risks, the compartmentalized 
approach will not be conservative enough. Thus, the lack of complete separation could 
lead to an underestimation of risk. In this case, bank managers and regulators should 
conclude that the bank’s overall capital level should be higher than the sum of the 
capital calculations derived from risks individually. 


Balance Sheet Management 


LO 6.f: Describe the relationship between leverage, market value of asset, and 
VaR within an active balance sheet management framework. 


When a balance sheet is actively managed, the amount of leverage on the balance sheet 
becomes procyclical. This results because changes in market prices and risks force 
changes to risk models and capital requirements, which require adjustments to the 
balance sheet (i.e., greater risks require greater levels of capital). Thus, capital
requirements tend to amplify boom and bust cycles (i.e., magnify financial and 
economic fluctuations). Academic studies have shown that balance sheet adjustments 
made through active risk management affect risk premiums and total financial market 
volatility. 


Leverage (measured as total assets to equity) is inversely related to the market value of 
total assets. When net worth rises, leverage decreases, and when net worth declines, 
leverage increases. This results in a cyclical feedback loop: asset purchases increase 
when asset prices are rising, and assets are sold when asset prices are declining. 
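A simple balance sheet example shows how this feedback loop works. The figures below are hypothetical, and the sketch assumes the institution targets a constant leverage ratio and funds any new asset purchases entirely with debt.

```python
def leverage(assets, debt):
    """Leverage measured as total assets divided by equity."""
    return assets / (assets - debt)


# Hypothetical starting balance sheet: assets 100, debt 90, equity 10.
assets, debt = 100.0, 90.0
target_leverage = leverage(assets, debt)

# Asset prices rise 1%; debt is unchanged, so equity absorbs the whole gain.
assets_after_gain = assets * 1.01
leverage_after_gain = leverage(assets_after_gain, debt)

# To return to the target leverage, assets must equal target x new equity.
equity_after_gain = assets_after_gain - debt
target_assets = target_leverage * equity_after_gain
purchases = target_assets - assets_after_gain  # funded with new debt (assumed)

print(round(target_leverage, 2), round(leverage_after_gain, 2), round(purchases, 2))
```

A 1% rise in asset prices lowers leverage, and restoring the target requires buying more assets in a rising market; the same arithmetic in reverse forces sales when prices fall.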


Value at risk is tied to a bank’s level of economic capital. Given a target ratio of VaR to 
economic capital, a VaR constraint on leveraged investors can be established. An 
economic boom will relax this VaR constraint since a bank’s level of equity is 
expanding. Thus, this expansion allows financial institutions to take on more risk and 
further increase debt. In contrast, an economic bust will tighten the VaR constraint and 
force investors to reduce leverage by selling assets when market prices and liquidity 
are declining. Therefore, despite increasingly sophisticated VaR models, current 
regulations intended to limit risk-taking have the potential to actually increase risk in 
financial markets. 


MODULE QUIZ 6.1
1. Which of the following statements is considered to be a drawback of the current Basel
framework for risk measurement? 
A. Risk measurement focuses exclusively on VaR analysis. 
B. The current regulatory system encourages more risk-taking when times are good. 
C. There is not enough focus on a compartmentalized approach to risk assessment. 
D. There is not a feedback loop via the pricing of risk. 


2. What type of liquidity risk is most troublesome for complex trading positions? 
A. Endogenous. 
B. Market-specific. 
C. Exogenous. 
D. Spectral. 


3. Within the framework of risk analysis, which of the following choices would be 
considered most critical when looking at risks within financial institutions? 
A. Computing separate capital requirements for a bank’s trading and banking books. 
B. Proper analysis of stressed VaR. 
C. Persistent use of backtesting. 
D. Consideration of interactions among risk factors. 


4. What is a key weakness of the value at risk (VaR) measure? VaR: 
A. does not consider the severity of losses in the tail of the returns distribution. 
B. is quite difficult to compute. 
C. is subadditive. 
D. has an insufficient amount of backtesting data. 


5. Which of the following statements is not an advantage of spectral risk measures over 
expected shortfall? Spectral risk measures: 


A. consider a manager’s aversion to risk. 

B. are a special case of expected shortfall measures. 

C. have the ability to modify the risk measure to reflect an investor’s specific risk 
aversion. 

D. have better smoothness properties when weighting observations. 


KEY CONCEPTS 


LO 6.a 


The proper time horizon over which VaR is estimated depends on portfolio liquidity 
and the purpose for risk measurement. It is important to incorporate time-varying 
volatility into VaR models, because ignoring this factor could lead to an 
underestimation of risk. Backtesting VaR models is less effective over longer time 
horizons due to portfolio instability. 


LO 6.b 


Exogenous liquidity represents market-specific, average transaction costs. Endogenous 
liquidity is the adjustment for the price effect of liquidating specific positions. 
Endogenous liquidity risk is especially relevant in high-stress market conditions. 


LO 6.c 


VaR estimates the maximum loss that can occur given a specified level of confidence. It 
is a quantitative risk measure used by investment managers as a method to measure 
portfolio market risk. A downside of VaR is that it is not subadditive. 


An alternative risk measure is expected shortfall, which is complex and 
computationally difficult. Spectral risk measures consider the investment manager’s 
aversion to risk. These measures have select advantages over expected shortfall. 


LO 6.d 


Within a bank’s risk assessment framework, a compartmentalized approach sums 
measured risks separately. A unified approach considers the interaction among various 
risk factors. Simply calculating individual risks and adding them together is not 
necessarily an accurate measure of true risk due to risk diversification. The Basel 
approach is a non-integrated approach to risk measurement. 


LO 6.e 


A top-down approach to risk assessment assumes that a bank’s portfolio can be cleanly 
subdivided according to market, credit, and operational risk measures. To better 
account for the interaction among risk factors, a bottom-up approach should be used. 


LO 6.f 


When a balance sheet is actively managed, the amount of leverage on the balance sheet 
becomes procyclical. Leverage is inversely related to the market value of total assets. 
This results in a cyclical feedback loop. Financial institution capital requirements tend 
to amplify boom and bust cycles. 


ANSWER KEY FOR MODULE QUIZ 


Module Quiz 6.1 


1.B Institutions have a tendency to buy more risky assets when prices of assets are 
rising. (LO 6.f) 


2.A Endogenous liquidity risk is especially relevant for complex trading positions. (LO 
6.b) 


3.D A unified approach is not used within the Basel framework, so the interaction 
among various risk factors is not considered when computing capital 
requirements for market, credit, and operational risk; however, these interactions 
should be considered due to risk diversification. (LO 6.d) 


4.A VaR does not consider losses beyond the VaR threshold level. (LO 6.c) 


5.B Spectral risk measures consider aversion to risk and offer better smoothness 
properties. Expected shortfall is a special case of spectral risk measures. (LO 6.c) 


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Meissner, Chapter 1. 
Study Session 2 


EXAM FOCUS 


This reading focuses on the role correlation plays as an input for quantifying risk in 
multiple areas of finance. We will explain how correlation changes the value and risk of 
structured products such as credit default swaps (CDSs), collateralized debt obligations 
(CDOs), multi-asset correlation options, and correlation swaps. For the exam, 
understand how correlation risk is related to market risk, systemic risk, credit risk, and 
concentration ratios, and be familiar with how changes in correlation impact implied 
volatility, the value of structured products, and default probabilities. Also, be prepared 
to discuss how the misunderstanding of correlation contributed to the financial crisis 
of 2007 to 2009. 


MODULE 7.1: FINANCIAL CORRELATION RISK 


LO 7.a: Describe financial correlation risk and the areas in which it appears in 
finance. 


Correlation risk measures the risk of financial loss resulting from adverse changes in 
correlations between financial or nonfinancial assets. An example of financial 
correlation risk is the negative correlation between interest rates and commodity 
prices. If interest rates rise, losses occur in commodity investments. Another example 
of this risk occurred during the 2012 Greek crisis. The positive correlation between 
Mexican bonds and Greek bonds caused losses for investors of Mexican bonds. 


The financial crisis beginning in 2007 illustrated how financial correlation risk can 
impact global markets. During this period, correlations across global markets rose 
sharply. Assets that previously had very low or negative correlations suddenly 
became very highly positively correlated and fell in value together. 


Nonfinancial assets can also be impacted by correlation risk. For example, the 
correlation of sovereign debt levels and currency values can result in financial losses 
for exporters. In 2012, U.S. exporters experienced losses due to the devaluation of the 
euro. Similarly, a low gross domestic product (GDP) for the United States has major 
adverse impacts on Asian and European exporters who rely heavily on the U.S. market. 
Another nonfinancial example is related to political events, such as uprisings in the 
Middle East that cause airline travel to decrease due to rising oil prices. 


Financial correlations can be categorized as static or dynamic. Static financial 
correlations do not change and measure the relationship between assets for a specific 
time period. Examples of static correlation measures are value at risk (VaR), copula 
correlations for collateralized debt obligations (CDOs), and the binomial default 
correlation model. Dynamic financial correlations measure the comovement of 
assets over time. Examples of dynamic financial correlations are pairs trading, 
deterministic correlation approaches, and stochastic correlation processes. 


Structured products are becoming an increasing area of concern regarding correlation 
risk. The following example demonstrates the role correlation risk plays in credit 
default swaps (CDSs). A CDS transfers credit risk from the investor (CDS buyer) to a 
counterparty (CDS seller). 


Suppose an investor purchases $1 million of French bonds and is concerned about 
France defaulting. The investor (CDS buyer) can transfer the default risk to a 
counterparty (CDS seller). Figure 7.1 illustrates the process for an investor transferring 
credit default risk by purchasing a CDS from Deutsche Bank (a large European bank). 


Figure 7.1: CDS Buyer Hedging Risk in Foreign Bonds 


Assume the recovery rate is zero with no accrued interest in the event of default. The 
investor (CDS buyer) is protected if France defaults because the investor receives a $1 
million payment from Deutsche Bank. The fixed CDS spread is valued based on the 
default probability of the reference asset (French bond) and the joint default 
correlation of Deutsche Bank and France. A paper loss occurs if the correlation risk 
between Deutsche Bank and France increases because the value of the CDS will 
decrease. If Deutsche Bank and France default (worst-case scenario), the investor loses 
the entire $1 million investment. 


If there is positive correlation risk between Deutsche Bank and France, the investor has 
wrong-way risk (WWR). The higher the correlation risk, the lower the CDS spread, s. 
The increasing correlation risk increases the probability that both the French bond 
(reference asset) and Deutsche Bank (counterparty) default. 


The relationship between the CDS spread, s, and correlation risk may be 
non-monotonic. This means that the CDS spread may sometimes increase and 
sometimes decrease as correlation risk increases. For example, for a correlation of -1 to 
-0.4, the CDS spread may increase slightly. This is due to the fact that a high negative 
correlation implies either France or Deutsche Bank will default, but not both. If France 
defaults, the $1 million is recovered from Deutsche Bank. If Deutsche Bank defaults, the 
investor loses the value of the CDS spread already paid and will need to purchase a new 
CDS to hedge the position. The new CDS spread will most likely be higher in 
the event that Deutsche Bank defaults or if the credit quality of France decreases. 


There are many areas in finance that have financial correlations. Five common finance 
areas where correlations play an important role are (1) investments, (2) trading, (3) 
risk management, (4) global markets, and (5) regulation. 


Correlations in Financial Investments 


In 1952, Harry Markowitz provided the foundation of modern investment theory by 
demonstrating the role that correlation plays in reducing risk. The portfolio return is 
simply the weighted average of the individual returns where the weights are the 
percentage of investment in each asset. The following equation defines the average 
return (i.e., mean) for a portfolio, $\mu_P$, comprised of assets X and Y. Asset X has a weight 
of $w_X$ and an average return of $\mu_X$, and asset Y has a weight of $w_Y$ and an average return 
of $\mu_Y$. 

$\mu_P = w_X \mu_X + w_Y \mu_Y$ 


The standard deviation of a portfolio is determined by the variances of each asset, the 
weights of each asset, and the covariance between assets. The risk or standard 
deviation (i.e., volatility) for a two-asset portfolio is calculated as follows: 

$\sigma_P = \sqrt{w_X^2 \sigma_X^2 + w_Y^2 \sigma_Y^2 + 2 w_X w_Y \mathrm{cov}_{XY}}$ 

Let us review how variances, covariance, and correlation are calculated using the 
following example. Suppose an analyst gathers historical prices for two assets, X and Y, 
and calculates their average returns as illustrated in Figure 7.2. 


Figure 7.2: Prices and Returns for Assets X and Y 


Year              X      Y      Return X    Return Y 
2009              90     150 
2010              120    180    0.3333      0.2000 
2011              105    340    (0.1250)    0.8889 
2012              170    320    0.6190      (0.0588) 
2013              150    360    (0.1176)    0.1250 
2014              270    310    0.8000      (0.1389) 
Average Return                  0.3019      0.2032 


The calculations for determining the standard deviations, variances, covariance, and 
correlation for assets X and Y are illustrated in Figure 7.3. 


Figure 7.3: Variances and Covariance for Assets X and Y 


Year                    Return X   Return Y   X_t - μ_X   Y_t - μ_Y   (X_t - μ_X)^2   (Y_t - μ_Y)^2   (X_t - μ_X) x (Y_t - μ_Y) 
2010                    0.3333     0.2000     0.0314      (0.0032)    0.0010          0.0000          (0.0001) 
2011                    (0.1250)   0.8889     (0.4269)    0.6857      0.1823          0.4701          (0.2927) 
2012                    0.6190     (0.0588)   0.3171      (0.2621)    0.1006          0.0687          (0.0831) 
2013                    (0.1176)   0.1250     (0.4196)    (0.0782)    0.1761          0.0061          0.0328 
2014                    0.8000     (0.1389)   0.4981      (0.3421)    0.2481          0.1170          (0.1704) 
Mean / Sum              0.3019     0.2032                             0.7079          0.6620          (0.5135) 
Variance / Covariance                                                 0.1770          0.1655          (0.1284) 
Standard Deviation                                                    0.4207          0.4068 
Correlation                                                                                           (0.7501) 


Notice that the sixth and seventh columns of Figure 7.3 are used to calculate the 
variance of X and Y, respectively. The deviation from each respective mean is squared to 
calculate the variance for each asset: $(X_t - \mu_X)^2$ for X and $(Y_t - \mu_Y)^2$ for Y. The sum of the 
squared deviations is then divided by four (i.e., the number of observations minus one for 
degrees of freedom). For example, the asset X variance is calculated by taking 0.7079 
and dividing by 4 (i.e., n - 1) to get 0.1770. 


Covariance is a measure of how two assets move together over time. The last column 
of Figure 7.3 illustrates that the calculation for covariance is similar to the calculation 
for variance. However, instead of squaring each deviation from the mean, the last 
column multiplies the deviations from the mean for each respective asset together. This 
not only captures the magnitude of movement but also the direction of movement. 
Thus, when asset returns are moving in opposite directions for the same time period, 
the product of their deviations is negative. The following equation defines the 
calculation for covariance. The sum of the products of the deviations from the means is 
-0.5135 in the last column of Figure 7.3. Covariance is calculated as -0.1284 by dividing 
-0.5135 by 4 (i.e., n - 1). 

$\mathrm{cov}_{XY} = \frac{\sum_{t=1}^{n} (X_t - \mu_X)(Y_t - \mu_Y)}{n - 1}$ 

In finance, the correlation coefficient is often used to standardize the comovement or 
covariance between assets. The following equation defines the correlation for two 
assets, X and Y, by dividing covariance, $\mathrm{cov}_{XY}$, by the product of the asset standard 
deviations, $\sigma_X \sigma_Y$. 

$\rho_{XY} = \frac{\mathrm{cov}_{XY}}{\sigma_X \sigma_Y}$ 

The correlation in this example is -0.7501, which is calculated as: 
-0.1284 / (0.4207 x 0.4068) = -0.7501 
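
The calculations in Figures 7.2 and 7.3 can be reproduced with a short script. The following is a minimal sketch using only the prices given in Figure 7.2; the function and variable names are illustrative choices, not part of the reading, and the sample (n - 1) divisor follows the text.

```python
import math

# Year-end prices from Figure 7.2 (2009 through 2014).
prices_x = [90, 120, 105, 170, 150, 270]
prices_y = [150, 180, 340, 320, 360, 310]


def simple_returns(prices):
    """Return the period-over-period percentage changes."""
    return [p1 / p0 - 1 for p0, p1 in zip(prices, prices[1:])]


def mean(values):
    """Arithmetic average of a list of numbers."""
    return sum(values) / len(values)


def sample_covariance(a, b):
    """Sample covariance of two series, dividing by n - 1."""
    mu_a, mu_b = mean(a), mean(b)
    return sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (len(a) - 1)


ret_x = simple_returns(prices_x)
ret_y = simple_returns(prices_y)

# The variance of a series is its covariance with itself.
var_x = sample_covariance(ret_x, ret_x)
var_y = sample_covariance(ret_y, ret_y)
cov_xy = sample_covariance(ret_x, ret_y)
sigma_x, sigma_y = math.sqrt(var_x), math.sqrt(var_y)
corr_xy = cov_xy / (sigma_x * sigma_y)

print(f"mean X = {mean(ret_x):.4f}, mean Y = {mean(ret_y):.4f}")
print(f"var X = {var_x:.4f}, var Y = {var_y:.4f}")
print(f"std X = {sigma_x:.4f}, std Y = {sigma_y:.4f}")
print(f"covariance = {cov_xy:.4f}, correlation = {corr_xy:.4f}")
```

Because the script carries full precision rather than four-decimal intermediate values, its results may differ from the figures in the last decimal place.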


In his research, Markowitz emphasized the importance of focusing on risk-adjusted 
returns. The return/risk ratio divides the average return for a portfolio, $\mu_P$, by the 
risk of the portfolio, $\sigma_P$. Figure 7.2 provided the average return for X and Y as 0.3019 
and 0.2032, respectively. If we assume the portfolio is equally weighted, the average 
return for the portfolio is 0.2526, the correlation between assets X and Y is -0.7501, and 
the standard deviations for X and Y are 0.4207 and 0.4068, respectively. The standard 
deviation for an equally weighted portfolio is determined using the following 
expression: 


$\sigma_P = \sqrt{(0.5^2 \times 0.4207^2) + (0.5^2 \times 0.4068^2) + (2 \times 0.5 \times 0.5 \times -0.1284)}$ 
$= \sqrt{0.02142} = 0.1464$ 


The return/risk ratio of this equally weighted two-asset portfolio is 1.725 (calculated 
as 0.2526 divided by 0.1464). Figure 7.4 illustrates the relationship of the return/risk 
ratio and correlation. The lower the correlation between the two assets, the higher the 
return/risk ratio. A very high negative correlation (e.g., -0.9) results in a return/risk 
ratio greater than 250%. A very high positive correlation (e.g., +0.9) results in a 
return/risk ratio near 50%. 
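
The relationship between correlation and the return/risk ratio can be explored numerically. The sketch below assumes the equally weighted portfolio and the means and standard deviations from Figures 7.2 and 7.3; the function name `portfolio_stats` and the grid of correlation values are illustrative assumptions.

```python
import math

MU_X, MU_Y = 0.3019, 0.2032        # average returns from Figure 7.2
SIGMA_X, SIGMA_Y = 0.4207, 0.4068  # standard deviations from Figure 7.3


def portfolio_stats(rho, w_x=0.5):
    """Return (mean, standard deviation, return/risk ratio) for two assets."""
    w_y = 1.0 - w_x
    mu_p = w_x * MU_X + w_y * MU_Y
    cov_xy = rho * SIGMA_X * SIGMA_Y  # covariance implied by the correlation
    var_p = (w_x * SIGMA_X) ** 2 + (w_y * SIGMA_Y) ** 2 + 2 * w_x * w_y * cov_xy
    sigma_p = math.sqrt(var_p)
    return mu_p, sigma_p, mu_p / sigma_p


# Sweep a few correlation levels, including the sample value of -0.7501.
for rho in (-0.9, -0.7501, 0.0, 0.5, 0.9):
    mu_p, sigma_p, ratio = portfolio_stats(rho)
    print(f"rho = {rho:+.4f}  sigma_P = {sigma_p:.4f}  return/risk = {ratio:.3f}")
```

Holding weights and volatilities fixed, only the covariance term changes with the correlation, so the ratio falls steadily as the correlation rises.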


Figure 7.4: Relationship of Return/Risk Ratio and Correlation 


Correlation Options 


LO 7.c: Describe how correlation impacts the price of quanto options as well as 
other multi-asset exotic options. 


Correlation trading strategies involve trading assets that have prices determined by 
the comovement of one or more assets over time. Correlation options have prices that 
are very sensitive to the correlation between two assets and are often referred to as 
multi-asset options. 


A quick review of the common notation for options is helpful. Assume the prices of assets 
one and two are denoted $S_1$ and $S_2$, respectively, and that the strike price, K, for a call 
option is the predetermined price at which an asset can be purchased. Likewise, the strike 
price, K, for a put option is the predetermined price at which an asset can be sold. 


The correlation between the two assets $S_1$ and $S_2$ is an important factor in determining 
the price of correlation options. Figure 7.5 lists a number of multi-asset correlation 
strategies along with their payoffs. For all of these strategies, a lower correlation 
results in a higher option price. A low correlation is expected to result in one asset 
price going higher while the other is lower. Thus, there is a better chance of a higher 
payout. 


Figure 7.5: Payoffs for Multi-Asset Correlation Strategies 


Correlation Strategies                   Payoff 
Option on higher of two stocks           $\max(S_1, S_2)$ 
Call option on maximum of two stocks     $\max[0, \max(S_1, S_2) - K]$ 
Exchange option                          $\max(0, S_2 - S_1)$ 
Spread call option                       $\max(0, S_2 - S_1 - K)$ 
Dual-strike call option                  $\max(0, S_1 - K_1, S_2 - K_2)$ 
Portfolio of basket options              $\max[\sum_i n_i \times S_i - K, 0]$, where $n_i$ = weight of asset i 


Another correlation strategy that is not listed in Figure 7.5 is a correlation option on 
the worse of two stocks where the payoff is the minimum of the two stock prices. This 
is the only correlation option where a lower correlation is not desirable because it 
reduces the correlation option price. 
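
The payoffs for these multi-asset strategies can be written as simple functions of the terminal prices. The sketch below is illustrative only: the function names, the terminal prices, the strikes, and the basket weights are all hypothetical values chosen to make the payoffs easy to check by hand.

```python
def better_of(s1, s2):
    """Option on the higher of two stocks."""
    return max(s1, s2)


def worse_of(s1, s2):
    """Option on the worse of two stocks (payoff is the lower price)."""
    return min(s1, s2)


def call_on_max(s1, s2, k):
    """Call option on the maximum of two stocks."""
    return max(0.0, max(s1, s2) - k)


def exchange_option(s1, s2):
    """Right to receive asset 2 and give away asset 1."""
    return max(0.0, s2 - s1)


def spread_call(s1, s2, k):
    """Call on the spread between asset 2 and asset 1."""
    return max(0.0, s2 - s1 - k)


def dual_strike_call(s1, s2, k1, k2):
    """Call with a separate strike for each asset."""
    return max(0.0, s1 - k1, s2 - k2)


def basket_option(prices, weights, k):
    """Call on a weighted basket of assets."""
    basket_value = sum(n * s for n, s in zip(weights, prices))
    return max(basket_value - k, 0.0)


# Hypothetical terminal prices and strikes.
s1, s2 = 95.0, 110.0
print("better of:       ", better_of(s1, s2))
print("worse of:        ", worse_of(s1, s2))
print("call on max:     ", call_on_max(s1, s2, 100.0))
print("exchange option: ", exchange_option(s1, s2))
print("spread call:     ", spread_call(s1, s2, 10.0))
print("dual-strike call:", dual_strike_call(s1, s2, 100.0, 100.0))
print("basket option:   ", basket_option([s1, s2], [0.5, 0.5], 100.0))
```

Most of these payoffs grow when the two prices end far apart, which is why a lower correlation raises their value; the worse-of payoff instead depends on the lower of the two prices.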


We can better understand the role correlation plays by taking a closer look at the 
valuation of the exchange option. The exchange option has a payoff of $\max(0, S_2 - S_1)$. 
The buyer of the option has the right to receive asset 2 and give away asset 1 when the 
option matures. The standard deviation of the exchange option, $\sigma_E$, is the implied 
volatility of $S_2 / S_1$, which is defined as follows, where X and Y denote the two 
underlying assets: 


$\sigma_E = \sqrt{\sigma_X^2 + \sigma_Y^2 - 2\mathrm{cov}_{XY}}$ 


Implied volatility is an important determinant of the option’s price. Thus, the exchange 
option price is highly sensitive to the covariance or correlation between the two assets. 
The price of the exchange option is close to zero when the correlation is close to 1 
because the two asset prices move together, and the spread between them does not 
change. The price of the exchange option increases as the correlation between the two 
assets decreases because the spread between the two assets is more likely to be greater. 
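
The sensitivity of the exchange option's volatility to correlation can be illustrated numerically. The following sketch assumes hypothetical asset volatilities of 30% each and expresses the covariance as the correlation times the product of the two volatilities; the function name is an illustrative choice.

```python
import math


def exchange_volatility(sigma_1, sigma_2, rho):
    """Volatility of the ratio of two asset prices for a given correlation."""
    cov_12 = rho * sigma_1 * sigma_2
    variance = sigma_1 ** 2 + sigma_2 ** 2 - 2 * cov_12
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(variance, 0.0))


# Hypothetical volatilities of 30% for both assets.
for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    vol = exchange_volatility(0.30, 0.30, rho)
    print(f"rho = {rho:+.1f}  exchange volatility = {vol:.4f}")
```

With equal volatilities, the exchange volatility shrinks to zero as the correlation approaches 1, consistent with an exchange option price close to zero in that case.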


The quanto option is another investment strategy using correlation options. It 
protects a domestic investor from foreign currency risk. However, the financial 
institution selling the quanto call does not know how deep in the money the call will be 
or what the exchange rate will be when the option is exercised to convert foreign 
currency to domestic currency. Lower correlations between currencies result in higher 
prices for quanto options. 


EXAMPLE: Quanto option 


Suppose a U.S. investor buys a quanto call to invest in the Nikkei index and protect 
potential gains by setting a fixed currency exchange rate (USD per JPY). How does 
the correlation between the call on the Nikkei index and the exchange rate impact 
the price of the quanto option? 


Answer: 


The U.S. investor buys a quanto call on the Nikkei index that has a fixed exchange 
rate for converting yen to dollars. If the correlation coefficient is positive 
(negative) between the Nikkei index and the yen relative to the dollar, an increasing 
Nikkei index results in an increasing (decreasing) value of the yen. Thus, the lower 
the correlation, the higher the price for the quanto option. If the Nikkei index 
increases and the yen decreases, the financial institution will need more yen to 
convert the profits in yen from the Nikkei investment into dollars. 


MODULE QUIZ 7.1 
1. Which of the following measures is most likely an example of a dynamic financial 
correlation measure? 
A. Pairs trading. 
B. Value at risk (VaR). 
C. Binomial default correlation model. 


D. Copula correlations for collateralized debt obligations (CDOs). 


MODULE 7.2: CORRELATION SWAPS, RISK MANAGEMENT, 
AND THE GLOBAL FINANCIAL CRISIS 


Correlation Swaps 
LO 7.d: Describe the structure, uses, and payoffs of a correlation swap. 


A correlation swap is used to trade a fixed correlation between two or more assets with 
the correlation that actually occurs. The correlation that will actually occur is 
unknown and is referred to as the realized or stochastic correlation. Figure 7.6 
illustrates how a correlation swap is structured. In this example, the party buying a 
correlation swap pays a fixed correlation rate of 30%, and the entity selling 
correlation receives the fixed correlation of 30%. 


Figure 7.6: Correlation Swap With a Fixed Correlation Rate 


[Diagram: the party buying correlation (fixed rate ρ payer) pays fixed ρ = 30% to the 
party selling correlation (fixed rate ρ receiver) and receives realized ρ in return.] 


The present value of the correlation swap increases for the correlation buyer if the 
realized correlation increases. The following equation calculates the realized 
correlation that actually occurs over the time period of the swap for a portfolio of n 
assets, where $\rho_{i,j}$ is the correlation coefficient: 

$\rho_{\text{realized}} = \frac{2}{n^2 - n} \sum_{i>j} \rho_{i,j}$ 

The payoff for the investor buying the correlation swap is calculated as follows: 


$\text{notional amount} \times (\rho_{\text{realized}} - \rho_{\text{fixed}})$ 


EXAMPLE: Correlation swap 


Suppose a correlation swap buyer pays a fixed correlation rate of 0.2 with a 
notional value of $1 million for one year for a portfolio of three assets. The realized 
pairwise correlations of the daily log returns $[\ln(S_t / S_{t-1})]$ at maturity for the three 
assets are $\rho_{2,1} = 0.6$, $\rho_{3,1} = 0.2$, and $\rho_{3,2} = 0.04$. (Note that i > j for all pairs.) What is 
the correlation swap buyer’s payoff? 


Answer: 
The realized correlation is calculated as: 


$\rho_{\text{realized}} = \frac{2}{3^2 - 3} (0.6 + 0.2 + 0.04) = 0.28$ 


The payoff for the correlation swap buyer is then calculated as: 


$1,000,000 x (0.28 - 0.20) = $80,000 
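
The same calculation can be expressed in code. This is a minimal sketch using the numbers from the example; the function names and the input check on the number of pairwise correlations are illustrative assumptions.

```python
def realized_correlation(pairwise, n_assets):
    """Average the pairwise correlations (i > j) across n assets."""
    expected_pairs = n_assets * (n_assets - 1) // 2
    if len(pairwise) != expected_pairs:
        raise ValueError(f"expected {expected_pairs} pairwise correlations")
    return 2.0 / (n_assets ** 2 - n_assets) * sum(pairwise)


def correlation_swap_payoff(notional, realized, fixed):
    """Payoff to the correlation buyer (the fixed-rate payer)."""
    return notional * (realized - fixed)


pairwise = [0.6, 0.2, 0.04]  # rho(2,1), rho(3,1), rho(3,2) from the example
rho_realized = realized_correlation(pairwise, n_assets=3)
payoff = correlation_swap_payoff(1_000_000, rho_realized, fixed=0.20)

print(f"realized correlation = {rho_realized:.2f}")
print(f"buyer payoff = ${payoff:,.0f}")
```

A negative result would mean the realized correlation fell short of the fixed rate, so the buyer would owe the seller.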


Another example of buying correlation is to buy call options on a stock index (such as 
the Standard & Poor’s 500 Index) and sell call options on individual stocks held within 
the index. If correlation increases between stocks within the index, this causes the 
implied volatility of the index call options to increase. The increase in price for the index 
call options is expected to be greater than the increase in price for the short call 
positions on the individual stocks. 


An investor can also buy correlation by paying fixed in a variance swap on an index and 
receiving fixed on individual securities within the index. An increase in correlation for 
securities within the index causes the variance to increase. An increase in variance 
causes the present value of the position to increase for the fixed variance swap payer 
(i.e. variance swap buyer). 


Relationship Between VaR and Correlation 


LO 7.e: Estimate the impact of different correlations between assets in the 
trading book on the VaR capital charge. 


The primary goal of risk management is to mitigate financial risk in the form of market 
risk, credit risk, and operational risk. A common risk management tool used to measure 
market risk is value at risk (VaR). VaR for a portfolio measures the potential loss in 
value for a specific time period for a given confidence level. The formula for calculating 
VaR using the variance-covariance method (a.k.a. delta-normal method) is shown as 
follows: 


$VaR_P = \sigma_P \alpha \sqrt{x}$ 


In this equation, $\sigma_P$ is the daily volatility of the portfolio, $\alpha$ is the z-value from the 
standard normal distribution for a specific confidence level, and x is the number of 
trading days. The volatility of the portfolio, $\sigma_P$, includes a measurement of correlation 
for assets within the portfolio defined as: 


$\sigma_P = \sqrt{\beta_h \times C \times \beta_v}$ 

where: 

$\beta_h$ = horizontal $\beta$ vector of investment amount 
C = covariance matrix of returns 
$\beta_v$ = vertical $\beta$ vector of investment amount 


EXAMPLE: Computing VaR with the variance-covariance method 


Assume you have a two-asset portfolio with $8 million in asset A and $4 million in 
asset B. The portfolio correlation is 0.6, and the daily standard deviation of returns 
for assets A and B are 1.5% and 2%, respectively. What is the 10-day VaR of this 
portfolio at a 99% confidence level (i.e., $\alpha$ = 2.33)? 


Answer: 


The first step in solving for the 10-day VaR requires constructing the covariance 
matrix. 


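$\mathrm{cov}_{11} = \sigma_1^2 = 0.015^2 = 0.000225$ 
$\mathrm{cov}_{22} = \sigma_2^2 = 0.02^2 = 0.0004$ 
$\mathrm{cov}_{12} = \rho_{12} \times \sigma_1 \times \sigma_2 = 0.6 \times 0.015 \times 0.02 = 0.00018$ 

Thus, the covariance matrix, C, can be represented as: 

$C = \begin{pmatrix} \mathrm{cov}_{11} & \mathrm{cov}_{12} \\ \mathrm{cov}_{21} & \mathrm{cov}_{22} \end{pmatrix} = \begin{pmatrix} 0.000225 & 0.00018 \\ 0.00018 & 0.0004 \end{pmatrix}$ 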


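Next, the standard deviation of the portfolio, $\sigma_P$, is determined by first solving for 
$\beta_h \times C$, then solving for $(\beta_h \times C) \times \beta_v$, and finally taking the square root of the 
second step. 


Step 1: Compute $\beta_h \times C$: 


$\begin{pmatrix} 8 & 4 \end{pmatrix} \begin{pmatrix} 0.000225 & 0.00018 \\ 0.00018 & 0.0004 \end{pmatrix}$ 
= [(8 x 0.000225) + (4 x 0.00018)   (8 x 0.00018) + (4 x 0.0004)] 
= [0.00252   0.00304] 


Step 2: Compute $(\beta_h \times C) \times \beta_v$: 


$\begin{pmatrix} 0.00252 & 0.00304 \end{pmatrix} \begin{pmatrix} 8 \\ 4 \end{pmatrix}$ 
= (0.00252 x 8) + (0.00304 x 4) = 0.03232 


Step 3: Compute $\sigma_P$: 


$\sigma_P = \sqrt{\beta_h \times C \times \beta_v} = \sqrt{0.03232} = 0.1798$ (in millions of dollars) 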


The 10-day portfolio VaR (in millions) at the 99% confidence level is then 
computed as: 


$VaR_P = \sigma_P \alpha \sqrt{x} = 0.1798 \times 2.33 \times \sqrt{10} = 1.3248$ 


This suggests that the loss will only exceed $1,324,800 once every 100 10-day 
periods. This is approximately once every 1,000 trading days or once every four 
years assuming there are 250 trading days in a year. 
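
The three steps above can be sketched in code using only the standard library. The inputs come from the example; the function names and the list-based matrix representation are illustrative assumptions rather than a prescribed implementation.

```python
import math


def covariance_matrix(sigma_1, sigma_2, rho):
    """Build the 2 x 2 covariance matrix from volatilities and correlation."""
    cov_12 = rho * sigma_1 * sigma_2
    return [[sigma_1 ** 2, cov_12],
            [cov_12, sigma_2 ** 2]]


def portfolio_sigma(amounts, cov):
    """Square root of (horizontal vector x C x vertical vector)."""
    n = len(amounts)
    # Step 1: horizontal vector of amounts times the covariance matrix.
    row = [sum(amounts[i] * cov[i][j] for i in range(n)) for j in range(n)]
    # Step 2: resulting row vector times the vertical vector of amounts.
    variance = sum(row[j] * amounts[j] for j in range(n))
    # Step 3: square root gives the daily portfolio volatility.
    return math.sqrt(variance)


def delta_normal_var(amounts, cov, alpha, days):
    """Variance-covariance VaR over the given number of trading days."""
    return portfolio_sigma(amounts, cov) * alpha * math.sqrt(days)


amounts = [8.0, 4.0]  # $ millions in assets A and B
cov = covariance_matrix(0.015, 0.02, rho=0.6)
sigma_p = portfolio_sigma(amounts, cov)
var_10d = delta_normal_var(amounts, cov, alpha=2.33, days=10)

print(f"daily portfolio volatility = {sigma_p:.4f} ($ millions)")
print(f"10-day 99% VaR = {var_10d:.4f} ($ millions)")
```

The script keeps full precision for the portfolio volatility, so its VaR may differ slightly in the fourth decimal place from a hand calculation that rounds the volatility to 0.1798 first.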


Figure 7.7 illustrates the relationship between correlation and VaR for the previous 
two-asset portfolio example. The VaR for the portfolio increases as the correlation 
between the two assets increases. 


Figure 7.7: Relationship Between VaR and Correlation for Two-Asset Portfolio 


The Basel Committee on Banking Supervision (BCBS) requires banks to hold capital 
based on the VaR for their portfolios. The BCBS requires banks to hold capital for assets 
in the trading book of at least three times the 10-day VaR. The trading book 
includes assets that are marked to market, such as stocks, futures, options, and swaps. 
The bank in the previous example would be required by the Basel Committee to hold 
capital of: 


$1,324,800 x 3 = $3,974,400 
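
To see how different correlations change the capital charge, the two-asset example can be re-run across a range of correlation values. The sketch below assumes the same positions, volatilities, z-value, and multiplier of three as the example; the constant and function names and the correlation grid are illustrative assumptions.

```python
import math

AMOUNT_A, AMOUNT_B = 8.0, 4.0   # $ millions invested in assets A and B
SIGMA_A, SIGMA_B = 0.015, 0.02  # daily standard deviations of returns
ALPHA = 2.33                    # z-value for a 99% confidence level
DAYS = 10                       # VaR horizon in trading days
MULTIPLIER = 3                  # minimum multiple of 10-day VaR


def ten_day_var(rho):
    """10-day variance-covariance VaR ($ millions) for a given correlation."""
    variance = ((AMOUNT_A * SIGMA_A) ** 2
                + (AMOUNT_B * SIGMA_B) ** 2
                + 2 * AMOUNT_A * AMOUNT_B * rho * SIGMA_A * SIGMA_B)
    return math.sqrt(variance) * ALPHA * math.sqrt(DAYS)


def capital_charge(rho):
    """Minimum trading book capital ($ millions) as a multiple of VaR."""
    return MULTIPLIER * ten_day_var(rho)


for rho in (-1.0, -0.5, 0.0, 0.6, 1.0):
    print(f"rho = {rho:+.1f}  VaR = {ten_day_var(rho):.4f}  "
          f"capital = {capital_charge(rho):.4f}")
```

Because the capital charge is a fixed multiple of VaR, it rises with the correlation between the two assets in the same way that VaR does.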


Correlations During the Global Financial Crisis 


LO 7.b: Explain how correlation contributed to the global financial crisis of 
2007-2009. 


The correlations of assets within and across different sectors and geographical regions 
were a major contributing factor for the financial crisis of 2007 to 2009. The economic 
environment, risk attitude, new derivative products, and new copula correlation 
models all contributed to the crisis. 


Investors became more risk averse shortly after the internet bubble that began in the 
1990s. The economy and risk environment were recovering with low credit spreads, low 
interest rates, and low volatility. The overly optimistic housing market led individuals 
to take on more debt on overvalued properties. New structured products known as 
collateralized debt obligations (CDOs), constant-proportion debt obligations (CPDOs), 
and credit default swaps (CDSs) helped encourage more speculation in real estate 
investments. Rating agencies, risk managers, and regulators overlooked the amount of 
leverage individuals and financial institutions were taking on. All of these contributing 
factors helped set the stage for the financial crisis that would be set off initially by 
defaults in the subprime mortgage market. 


Risk managers, financial institutions, and investors did not understand how to properly 
measure correlation. Risk managers used the newly developed copula correlation 
model for measuring correlation in structured products. It is common for CDOs to 
contain up to 125 assets. The copula correlation model was designed to measure the 
[n x (n - 1) / 2] pairwise correlations among the n assets in a structured product. Thus, 
risk managers of CDOs needed to estimate and manage 7,750 correlations (i.e., 125 x 124 / 2). 
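
The count of pairwise correlations grows quickly with the number of assets. As a quick check of the 7,750 figure, the sketch below compares the closed-form count with a direct enumeration of asset pairs; the function name is an illustrative choice.

```python
from itertools import combinations


def pairwise_correlation_count(n_assets):
    """Number of distinct asset pairs: n x (n - 1) / 2."""
    return n_assets * (n_assets - 1) // 2


n = 125
count_by_formula = pairwise_correlation_count(n)
# Enumerate every unordered pair of asset indices as a cross-check.
count_by_enumeration = sum(1 for _ in combinations(range(n), 2))

print(f"formula: {count_by_formula}, enumeration: {count_by_enumeration}")
```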


CDOs are separated into several tranches based on the degree of default risk. The 
riskiest tranche is called the equity tranche, and investors in this tranche are typically 
exposed to the first 3% of defaults. The next tranche is referred to as the mezzanine 
tranche where investors are typically exposed to the next 4% of defaults (above 3% to 
7%). The copula correlation model was trusted to monitor the default correlations 
across different tranches. A number of large hedge funds were long the CDO equity 
tranche and short the CDO mezzanine tranche. In other words, potential losses from the 
equity tranche were thought to be hedged with gains from the mezzanine tranche. 
Unfortunately, huge losses led to bankruptcy filings by several large hedge funds 
because the correlation properties across tranches were not correctly understood. 


Correlation played a key role in the bond market for U.S. automobile makers and the 
CDO market just prior to the financial crisis. A junk bond rating typically leads to 
major price decreases as pension funds, insurance companies, and other large financial 
institutions sell their holdings because they are not allowed to hold non-investment grade bonds. 
Bonds within specific credit quality levels typically are more highly correlated. Bonds 
across credit quality levels typically have lower correlations. 


Rating agencies downgraded General Motors and Ford to junk bond status in May of 
2005. Following the change in bond ratings for Ford and General Motors, the equity 
tranche spread increased dramatically. This caused losses for hedge funds that were 
long the equity tranche (i.e. the initial spread received became lower than the market 
spread). At the same time, the correlations decreased for CDOs of investment grade 
bonds. The lower correlations in the mezzanine tranche lowered the mezzanine tranche 
spread, which led to losses for hedge funds that were short the mezzanine tranche (i.e. 
the initial spread paid became higher than the market spread). 


The CDO market, composed primarily of residential mortgages, increased from $64 
billion in 2003 to $455 billion in 2006. Liberal lending policies combined with 
overvalued real estate created the perfect storm in the subprime mortgage market. 
Housing prices became stagnant in 2006, leading to the first string of mortgage defaults. 
In 2007, the real estate market collapsed as the number of mortgage defaults increased. 
The CDO market, which was linked closely to mortgages, collapsed as well. This led to a 
global crisis as stock and commodities markets collapsed around the world. As a result, 
correlations in stock markets increased as the U.S. stock market crashed. Default 
correlations in CDO markets and bond markets also increased as the value of real estate 
and the financial stability of individuals and institutions became highly questionable. 


The CDO equity tranche spread typically decreases when default correlations increase. 
A lower equity tranche spread typically leads to an increase in value of the equity 
tranche. Unfortunately, the probability of default in the subprime market increased so 
dramatically in 2007 that it lowered the value of all CDO tranches. Thus, the default 
correlations across CDO tranches increased. The default rates also increased 
dramatically for all residential mortgages. Even the highest quality CDO tranches with 
AAA ratings lost 20% of their value as they were no longer protected from the lower 
tranches. The losses were even greater for many institutions with excess leverage in the 
senior tranches that were thought to be safe havens. The leverage in the CDO market 
caused risk exposures for investors to be 10 to 20 times higher than the investments. 


In addition to the rapid growth in the CDO market, the credit default swap (CDS) 
market grew from $8 trillion to $60 trillion during the 2004 to 2007 time period. As 
mentioned earlier, CDSs are used to hedge default risk. CDSs are similar to insurance 
products as the risk exposure in the debt market is transferred to a broader market. The 
CDS seller must be financially stable enough to protect against losses. The global 
financial crisis revealed that American International Group (AIG) was overextended, 
selling $500 billion in CDSs with little reinsurance. Also, Lehman Brothers had leverage 
30.7 times greater than equity in September 2008 leading to its bankruptcy. However, 
the leverage was much higher considering the large number of derivatives transactions 
that were also held with 8,000 different counterparties. 


Regulators are in the process of developing Basel III in response to the financial crisis. 
New standards for liquidity and leverage ratios for financial institutions are also being 
implemented. New correlation models are being developed and implemented such as 
the Gaussian copula, credit value adjustment (CVA) for correlations in derivatives 
transactions, and wrong-way risk (WWR) correlation. These models are intended to
address correlated defaults in multi-asset portfolios.


MODULE QUIZ 7.2


1. Suppose an individual buys a correlation swap with a fixed correlation of 0.2 and a
notional value of $1 million for one year. The realized pairwise correlations of the
daily log returns at maturity for three assets are $\rho_{2,1} = 0.7$, $\rho_{3,1} = 0.2$, and $\rho_{3,2} = 0.3$.
What is the correlation swap buyer’s payoff at maturity?

A. $100,000. 
B. $200,000. 
C. $300,000. 
D. $400,000. 


2. Suppose a financial institution has a two-asset portfolio with $7 million in asset A and 
$5 million in asset B. The portfolio correlation is 0.4, and the daily standard deviation 
of returns for asset A and B are 2% and 1%, respectively. What is the 10-day value at 
risk (VaR) of this portfolio at a 99% confidence level ($\alpha$ = 2.33)?

A. $1.226 million. 
B. $1.670 million. 
C. $2.810 million. 
D. $3.243 million. 


3. In May of 2005, several large hedge funds had speculative positions in the 
collateralized debt obligations (CDOs) tranches. These hedge funds were forced into 
bankruptcy due to the lack of understanding of correlations across tranches. Which of 
the following statements best describes the positions held by hedge funds at this time
and the role of changing correlations? Hedge funds held a: 

A. long equity tranche and short mezzanine tranche when the correlations in both 
tranches decreased. 

B. short equity tranche and long mezzanine tranche when the correlations in both 
tranches increased. 


C. short senior tranche and long mezzanine tranche when the correlation in the 
mezzanine tranche increased. 

D. long mezzanine tranche and short equity tranche when the correlation in the 
mezzanine tranche decreased. 


MODULE 7.3: THE ROLE OF CORRELATION RISK IN 
OTHER TYPES OF RISK 


LO 7.f: Explain the role of correlation risk in market risk and credit risk. 


LO 7.g: Relate correlation risk to systemic and concentration risk. 


A major concern for risk managers is the relationship between correlation risk and 
other types of risk such as market, credit, systemic, and concentration risk. Examples of 
major factors contributing to market risk are interest rate risk, currency risk, equity 
price risk, and commodity risk. As discussed earlier, risk managers typically measure 
market risk in terms of VaR. Because the covariance matrix of assets is an important 
input of VaR, correlation risk is extremely important. Another important risk 
management tool used to quantify market risk is expected shortfall (ES). Expected 
shortfall measures the impact of market risk for extreme events or tail risk. Given that 
correlation risk refers to the risk that the correlation between assets changes over 
time, the concern is how the covariance matrix used for calculating VaR or ES changes 
over time due to changes in market risk. 


Risk managers are also concerned with measuring credit risk with respect to migration 
risk and default risk. Migration risk is the risk that the quality of a debtor decreases 
following the lowering of quality ratings. Lower debt quality ratings imply higher 
default probabilities. When a debt rating decreases, the present value of the underlying 
asset decreases, which creates a paper loss. As discussed previously, correlation risk 
between a reference asset and counterparty (CDS seller) is an important concern for 
investors. A higher correlation increases the probability of total loss of an investment. 


Financial institutions such as mortgage companies and banks provide a variety of loans 
to individuals and entities. Default correlation is of critical importance to financial 
institutions in quantifying the degree that defaults occur at the same time. A lower 
default correlation is associated with greater diversification of credit risk. Empirical 
studies have examined historical default correlations across and within industries. 
Most default correlations across industries are positive with the exception of the 
energy sector. The energy sector has little or no correlation with other sectors and is, 
therefore, more resistant to recessions. 


Historical data suggests that default correlations are higher within industries. This 
finding implies that systematic factors impacting the overall market and credit risk 
have much more influence in defaults than individual or company-specific factors. For 
example, if Chrysler defaults, then Ford and General Motors are more likely to default 
and have losses rather than benefit from increased market share. Thus, commercial 
banks limit exposures within a specific industry. The key point is that creditors benefit 
by diversifying exposure across industries to lower the default correlations of debtors. 


Risk managers can also use a term structure of defaults to analyze credit risk. Rating 
agencies such as Moody’s provide default probabilities based on bond ratings and time 
to maturity as illustrated in Figure 7.8. 


Figure 7.8: Default Term Structure for A- and CC-Rated Bonds 
Notice in Figure 7.8 that the default term structure increases slightly with time to
maturity for most investment grade bonds (solid line). This is expected because bonds 
are more likely to default as many market or company factors can change over a longer 
time period. Conversely, for non-investment grade bonds (dashed line), the probability 
of default is higher in the immediate time horizon. If the company survives the near- 
term distressed situation, the probability of default decreases over time. 


Lehman Brothers filed for bankruptcy in September of 2008. This bankruptcy event was 
an important signal of the severity of the financial crisis and the level of systemic risk. 
Systemic risk refers to the potential risk of a collapse of the entire financial system. It 
is interesting to examine the extent of the stock market crash that began in October 
2007. From October 2007 to March 2009, the Dow Jones Industrial Average fell over 
50% and only 11 stocks increased in the entire Standard & Poor’s 500 Index (S&P 500). 
The decrease in value of 489 stocks in the S&P 500 during this time period reflected 
how a systemic financial crisis impacts the economy with decreasing disposable 
income for individuals, decreasing GDP, and increasing unemployment. 


The sectors represented in the 11 increasing stocks were consumer staples (Family
Dollar, Ross Stores, and Walmart), educational (Apollo Group and DeVry Inc.),
pharmaceuticals (Edwards Lifesciences and Gilead Pharmaceuticals), agricultural (CF
Industries), entertainment (Netflix), energy (Southwestern Energy), and automotive
(AutoZone). The consumer staples and pharmaceutical sectors are often recession
resistant as individuals continue to need basic necessities such as food, household 
supplies, and medications. The educational sector is also resilient as more unemployed 
workers go back to school for education and career changes. 


Studies examined the relationship between the correlations of stocks in the U.S. stock 
market and the overall market during the 2007 crisis. From August of 2008 to March of 
2009, there was a freefall in the U.S. equity market. During this same time period, 
correlations of stocks with each other increased dramatically from a pre-crisis average 
correlation level of 27% to over 50%. Thus, when diversification was needed most
during the financial crisis, almost all stocks became more highly correlated and,
therefore, offered less diversification. The severity of correlation risk is even greater during a
systemic crisis when one considers the higher correlations of U.S. equities with bonds 
and international equities. 


Concentration risk is the financial loss that arises from the exposure to multiple 
counterparties for a specific group. Concentration risk is measured by the 
concentration ratio. A lower (higher) concentration ratio reflects that the creditor has 
more (less) diversified default risk. For example, the concentration ratio for a creditor 
with 100 loans of equal size to different entities is 0.01 (= 1 / 100). If a creditor has one 
loan to one entity, the concentration ratio for the creditor is 1.0 (= 1 / 1). Loans can be 
further analyzed by grouping them into different sectors. If loan defaults are more 
highly correlated within sectors, when one loan defaults within a specific sector, it is 
more likely that another loan within the same sector will also default. The following 
examples illustrate the relationship between concentration risk and correlation risk. 


EXAMPLE: Concentration ratio for Bank X and one loan to Company A 


Suppose Commercial Bank X makes a $5 million loan to Company A, which has a 
5% default probability. What is the concentration ratio and expected loss (EL) for 
Commercial Bank X under the worst-case scenario? Assume loss given default 
(LGD) is 100%. 


Answer: 


Commercial Bank X has a concentration ratio of 1.0 because there is only one loan. 
The worst-case scenario is that Company A defaults resulting in a total loss of loan 
value. Given that there is a 5% probability that Company A defaults, EL for 
Commercial Bank X is $250,000 (= 0.05 x 5,000,000). 


EXAMPLE: Concentration ratio for Bank Y and two loans to Companies A and 
B 


Suppose Commercial Bank Y makes a $2,500,000 loan to Company A and a
$2,500,000 loan to Company B. Assuming Companies A and B each have a 5%
default probability, what is the concentration ratio and expected loss (EL) for 
Commercial Bank Y under the worst-case scenario? Assume default correlation 
between companies is 1.0 and loss given default (LGD) is 100%. 


Answer: 


Commercial Bank Y has a concentration ratio of 0.5 (calculated as 1 / 2). The 
expected loss for Commercial Bank Y depends on the default correlation of 
Companies A and B. Note that changes in the concentration ratio are directly
related to changes in the default correlations. A decrease in the concentration ratio
results in a decrease in the default correlation. The default of Companies A and B 
can be expressed as two binomial events with a value of 1 in default and 0 if not in 
default. 


Figure 7.9 illustrates the joint probability that both Companies A and B are in 
default, P(AB). 


Figure 7.9: Joint Probability of Default for Companies A and B 
The following equation computes the joint probability that both Companies A and
B are in default at the same time: 


$$P(AB) = \rho_{AB}\sqrt{PD_A(1 - PD_A) \times PD_B(1 - PD_B)} + PD_A \times PD_B$$
where:
$\rho_{AB}$ = default correlation coefficient for A and B
$\sqrt{PD_A(1 - PD_A)}$ = standard deviation of the binomial event A


The default probability of Company A is 5%. Thus, the standard deviation for 
Company A is: 


$$\sqrt{0.05(1 - 0.05)} = 0.2179$$


Company B also has a default probability of 5% and, therefore, will also have a 
standard deviation of 0.2179. We can now calculate the expected loss under the 
worst-case scenario where both Companies A and B are in default. Assuming that 
the default correlation between A and B is 1.0, the joint probability of default is: 


$$P(AB) = 1.0\sqrt{0.05(0.95) \times 0.05(0.95)} + 0.05 \times 0.05$$

$$= 1.0\sqrt{0.00226} + 0.0025 = 0.05$$


If the default correlation between Companies A and B is 1.0, the expected loss for 
Commercial Bank Y is $250,000 (0.05 x $5,000,000). Notice that when the default 
correlation is 1.0, this is the same as making a $5 million loan to one company. 


Now, let’s assume that the default correlation between Companies A and B is 0.5. 
What is the expected loss for Commercial Bank Y? The joint probability of default 
for A and B, assuming a default correlation of 0.5, is: 


$$P(AB) = 0.5\sqrt{0.00226} + 0.0025 = 0.02627$$


Thus, the expected loss for the worst-case scenario for Commercial Bank Y is: 
EL = 0.02627 x $5,000,000 = $131,350 


If we assume the default correlation coefficient is 0, the joint probability of default 
is 0.0025 and the expected loss for Commercial Bank Y is only $12,500. Thus, a 
lower default correlation results in a lower expected loss under the worst-case 
scenario. 
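
The Bank Y calculation can be checked with a short script. This is a minimal sketch, assuming the joint default probability formula given in this example and an LGD of 100%; the function names are illustrative and are not part of the assigned reading.

```python
import math


def joint_default_probability(pd_a: float, pd_b: float, rho: float) -> float:
    """P(AB) = rho * sqrt(PD_A(1 - PD_A) * PD_B(1 - PD_B)) + PD_A * PD_B."""
    std_a = math.sqrt(pd_a * (1.0 - pd_a))  # std. dev. of binomial event A
    std_b = math.sqrt(pd_b * (1.0 - pd_b))  # std. dev. of binomial event B
    return rho * std_a * std_b + pd_a * pd_b


def worst_case_expected_loss(total_exposure: float, pd_a: float, pd_b: float,
                             rho: float, lgd: float = 1.0) -> float:
    """Expected loss when both loans default at the same time."""
    return joint_default_probability(pd_a, pd_b, rho) * lgd * total_exposure


if __name__ == "__main__":
    # Bank Y: two $2.5 million loans, each with a 5% default probability.
    for rho in (1.0, 0.5, 0.0):
        p_ab = joint_default_probability(0.05, 0.05, rho)
        el = worst_case_expected_loss(5_000_000, 0.05, 0.05, rho)
        print(f"rho = {rho:.1f}: P(AB) = {p_ab:.5f}, EL = ${el:,.0f}")
```

Without intermediate rounding, a default correlation of 0.5 gives a joint probability of 0.02625 and an expected loss of $131,250. The figures in the example (0.02627 and $131,350) differ slightly because the square-root term is computed from the rounded value 0.00226.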


EXAMPLE: Concentration ratio for Bank Z and three loans to Companies A, B, 
and C 


Now we can examine what happens to the joint probability of default (i.e., the 
worst-case scenario) if the concentration ratio is reduced further. Suppose that 
Commercial Bank Z makes three $1,666,667 loans to Companies A, B, and C. Also 
assume the default probability for each company is 5%. What is the concentration 
ratio for Commercial Bank Z, and how will the joint probability be impacted? 


Answer: 


Commercial Bank Z has a concentration ratio of 0.333 (calculated as 1 / 3). Figure 
7.10 illustrates the joint probability of all three loans defaulting at the same time, 
P(ABC) (i.e., the small area in the center of Figure 7.10 where all three default 
probabilities overlap). Note that as the concentration ratio decreases, the joint 
probability also decreases. 


Figure 7.10: Joint Probability of Default for Companies A, B, and C 
PROFESSOR’S NOTE

The assigned reading did not cover the calculation of the joint probability for
three binomial events occurring. The focus here is on understanding that as
the concentration ratio decreases, the probability of the worst-case scenario
also decreases. Both a lower concentration ratio and lower correlation
coefficient reduce the joint probability of default.


MODULE QUIZ 7.3


1. Suppose a creditor makes a $4 million loan to Company X and a $4 million loan to 
Company Y. Based on historical information of companies in this industry, Companies 
X and Y each have a 7% default probability and a default correlation coefficient of 0.6. 
The expected loss for this creditor under the worst-case scenario assuming loss given 
default is 100% is closest to: 


A. $280,150. 
B. $351,680. 
C. $439,600. 
D. $560,430. 


2. The relationship of correlation risk to credit risk is an important area of concern for 
risk managers. Which of the following statements regarding default probabilities and 
default correlations is incorrect? 


A. Creditors benefit by diversifying exposure across industries to lower the default 
correlations of debtors. 

B. The default term structure increases with time to maturity for most investment 
grade bonds. 

C. The probability of default is higher in the long-term time horizon for non- 
investment grade bonds. 


D. Changes in the concentration ratio are directly related to changes in default 
correlations. 


KEY CONCEPTS 


LO 7.a 


Correlation risk measures the risk of financial loss resulting from adverse changes in 
correlations between financial or nonfinancial assets. For example, financial correlation 
risk can result from the negative correlation between interest rates and commodity 
prices. For almost all correlation option strategies, a lower correlation results in a
higher option price. 


LO 7.b 


In May of 2005, several large hedge funds had losses on both sides of a hedged position 
long the collateralized debt obligation (CDO) equity tranche and short the CDO 
mezzanine tranche. Both positions resulted in paper losses when equity tranche 
spreads increased and mezzanine tranche spreads decreased. 


American International Group (AIG) and Lehman Brothers were highly leveraged in 
credit default swaps (CDSs) during the financial crisis of 2007-2009. Their financial 
troubles revealed the impact of increasing default correlations with tremendous 
leverage. 


LO 7.c 


Correlation trading strategies involve trading assets that have prices determined by the 
comovement of one or more assets over time. Correlation options have prices that are
very sensitive to the correlation between two assets and are often referred to as multi-asset
(exotic) options. The quanto option is another investment strategy using
correlation options, which protects a domestic investor from foreign currency risk. 


LO 7.d 


A correlation swap is used to trade a fixed correlation between two assets with the 
realized correlation. The payoff for the investor buying the correlation swap is: 


notional amount × ($\rho_{realized}$ - $\rho_{fixed}$)


where:
$$\rho_{realized} = \frac{2}{n^2 - n}\sum_{i>j}\rho_{i,j}$$

LO 7.e 


Value at risk (VaR) for a portfolio measures the potential loss in value for a specific 
time period for a given confidence level: 


$$VaR_P = \sigma_P \alpha \sqrt{x}$$


The VaR for a portfolio increases as the correlation between assets increases. The Basel
Committee on Banking Supervision requires banks to hold capital for assets in the 
trading book of at least three times greater than 10-day VaR (i.e., VaR capital charge = 3 
x 10-day VaR). 


LO 7.f 


The covariance matrix of assets is an important input for value at risk (VaR) and 
expected shortfall (ES). These risk management tools are sensitive to changes in 
correlation. 


A lower default correlation is associated with greater diversification of credit risk. 
Creditors benefit by diversifying exposure across industries to lower the default 
correlations of debtors. The default term structure increases with time to maturity for 
most investment grade bonds. The probability of default is higher in the immediate 
time horizon for non-investment grade bonds. 


LO 7.g 


Systemic risk refers to the potential risk of a collapse of the entire financial system. The 
severity of correlation risk is even greater during a systemic crisis considering the 
higher correlations of U.S. equities with bonds and international equities. 


Changes in the concentration risk, which is measured by the concentration ratio, are 
directly related to changes in default correlations. A lower concentration ratio and 
lower correlation coefficient both reduce the joint probability of default. 


ANSWER KEY FOR MODULE QUIZZES 


Module Quiz 7.1 


1.A Dynamic financial correlations measure the comovement of assets over time. 
Examples of dynamic financial correlations are pairs trading, deterministic 
correlation approaches, and stochastic correlation processes. The other choices 
are examples of static financial correlations. (LO 7.a) 


Module Quiz 7.2 
1.B First, calculate the realized correlation as follows: 


$$\rho_{realized} = \frac{2}{3^2 - 3} \times (0.7 + 0.2 + 0.3) = 0.4$$


The payoff for the correlation buyer is then calculated as: 
$1,000,000 x (0.4 - 0.2) = $200,000 


(LO 7.d) 
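
The same payoff can be reproduced in a few lines of Python. This is only an illustrative sketch, assuming the realized correlation is the average of the n(n - 1) / 2 pairwise correlations as in the formula for LO 7.d; the function names are hypothetical.

```python
def realized_correlation(pairwise: list[float], n_assets: int) -> float:
    """rho_realized = 2 / (n^2 - n) * sum of pairwise correlations (i > j)."""
    expected_pairs = n_assets * (n_assets - 1) // 2
    if len(pairwise) != expected_pairs:
        raise ValueError(f"expected {expected_pairs} pairwise correlations")
    return 2.0 / (n_assets ** 2 - n_assets) * sum(pairwise)


def correlation_swap_payoff(notional: float, pairwise: list[float],
                            n_assets: int, fixed: float) -> float:
    """Payoff to the buyer: notional * (rho_realized - rho_fixed)."""
    return notional * (realized_correlation(pairwise, n_assets) - fixed)


if __name__ == "__main__":
    # Inputs from Module Quiz 7.2, Question 1.
    payoff = correlation_swap_payoff(1_000_000, [0.7, 0.2, 0.3], 3, 0.2)
    print(f"Buyer payoff: ${payoff:,.0f}")
```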
2.A The first step in solving for the 10-day VaR requires calculating the covariance 
matrix. 
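$cov_{11} = \sigma_1^2 = 0.02^2 = 0.0004$
$cov_{22} = \sigma_2^2 = 0.01^2 = 0.0001$


$cov_{12} = \rho_{12} \times \sigma_1 \times \sigma_2 = 0.4 \times 0.02 \times 0.01 = 0.00008$


Thus, the covariance matrix, C, can be represented as:
$$C = \begin{pmatrix} 0.0004 & 0.00008 \\ 0.00008 & 0.0001 \end{pmatrix}$$


Next, the standard deviation of the portfolio, $\sigma_P$, is determined as follows:


Step 1: Compute $\beta_h \times C$:


$$\begin{pmatrix} 7 & 5 \end{pmatrix}\begin{pmatrix} 0.0004 & 0.00008 \\ 0.00008 & 0.0001 \end{pmatrix}$$
$$= [(7 \times 0.0004) + (5 \times 0.00008) \quad (7 \times 0.00008) + (5 \times 0.0001)]$$
$$= [0.0032 \quad 0.00106]$$


Step 2: Compute $(\beta_h \times C) \times \beta_v$:


$$[0.0032 \quad 0.00106]\begin{pmatrix} 7 \\ 5 \end{pmatrix} = (0.0032 \times 7) + (0.00106 \times 5) = 0.0277$$

Step 3: Compute $\sigma_P$:


$$\sigma_P = \sqrt{\beta_h \times C \times \beta_v} = \sqrt{0.0277} = 0.1664 \text{ or } 16.64\%$$


The 10-day portfolio VaR (in millions) at the 99% confidence level is then
computed as:
$$VaR_P = \sigma_P \alpha \sqrt{x} = 0.1664 \times 2.33 \times \sqrt{10} = \$1.226 \text{ million}$$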


(LO 7.e) 
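
For readers who want to verify the matrix arithmetic, the three steps can be sketched in plain Python. The helper names are illustrative assumptions; positions are in millions of dollars and volatilities are daily standard deviations, as in the question.

```python
import math


def covariance_matrix(vols: list[float], corr: list[list[float]]) -> list[list[float]]:
    """cov_ij = rho_ij * sigma_i * sigma_j."""
    n = len(vols)
    return [[corr[i][j] * vols[i] * vols[j] for j in range(n)] for i in range(n)]


def portfolio_std(positions: list[float], cov: list[list[float]]) -> float:
    """sigma_P = sqrt(beta_h x C x beta_v) for a vector of position sizes."""
    n = len(positions)
    # Step 1: row vector times covariance matrix.
    row = [sum(positions[i] * cov[i][j] for i in range(n)) for j in range(n)]
    # Step 2: result times the column vector of positions.
    variance = sum(row[j] * positions[j] for j in range(n))
    # Step 3: square root of the portfolio variance.
    return math.sqrt(variance)


def portfolio_var(positions: list[float], vols: list[float],
                  corr: list[list[float]], alpha: float, horizon_days: int) -> float:
    """VaR_P = sigma_P * alpha * sqrt(x)."""
    cov = covariance_matrix(vols, corr)
    return portfolio_std(positions, cov) * alpha * math.sqrt(horizon_days)


if __name__ == "__main__":
    # Inputs from Module Quiz 7.2, Question 2 (positions in $ millions).
    var_10d = portfolio_var([7.0, 5.0], [0.02, 0.01],
                            [[1.0, 0.4], [0.4, 1.0]], alpha=2.33, horizon_days=10)
    print(f"10-day 99% VaR: ${var_10d:.3f} million")
```

Raising the correlation input in this sketch increases the off-diagonal covariance terms and, therefore, the portfolio VaR, which is the relationship summarized under LO 7.e.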


3.A A number of large hedge funds were long the CDO equity tranche and short the 
CDO mezzanine tranche. Following the change in bond ratings for Ford and 
General Motors, the equity tranche spread increased. This caused losses on the
long equity tranche position. At the same time, the mezzanine tranche spread
decreased, which led to losses on the short mezzanine tranche position. (LO 7.b) 


Module Quiz 7.3 


1.B The worst-case scenario is the joint probability that both loans default at the 
same time. The joint probability of default is computed as: 
$$P(AB) = 0.6\sqrt{0.07(0.93) \times 0.07(0.93)} + 0.07 \times 0.07$$
$$= 0.6\sqrt{0.00424} + 0.0049 = 0.04396$$


Thus, the expected loss for the worst-case scenario for the creditor is: 
EL = 0.04396 x $8,000,000 = $351,680 


(LO 7.g) 
2.C The probability of default is higher in the immediate time horizon for
non-investment grade bonds. The probability of default decreases over time if the
company survives the near-term distressed situation. (LO 7.f)


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Meissner, Chapter 2. 


READING 8 


EMPIRICAL PROPERTIES OF 
CORRELATION: HOW DO 
CORRELATIONS BEHAVE IN THE 
REAL WORLD? 


Study Session 2 


EXAM FOCUS 


This reading examines how equity correlations and correlation volatility change during 
different economic states. It also discusses how to use a standard regression model to 
estimate the mean reversion rate and autocorrelation. For the exam, be able to 
calculate the mean reversion rate and be prepared to discuss and contrast the nature of 
correlations and correlation volatility for equity, bond, and default correlations. Also, 
be prepared to discuss the best-fit distribution for these three types of correlation 
distributions. 


MODULE 8.1: EMPIRICAL PROPERTIES OF 
CORRELATION 


Correlations During Different Economic States 


LO 8.a: Describe how equity correlations and correlation volatilities behave 
throughout various economic states. 


The financial crisis of 2007-2009 provided new information on how correlation 
changes during different economic states. From 1972 to 2017, an empirical 
investigation on correlations of the 30 common stocks of the Dow Jones Industrial 
Average (Dow) was conducted. The correlation statistic was used to create a 30 x 30 
correlation matrix for each stock in the Dow every month. This required 900 
correlation calculations (30 x 30 = 900). There were 534 months in the study, so 
480,600 monthly correlations were computed (900 x 534 = 480,600). 


The average correlation values were compared for three states of the U.S. economy 
based on gross domestic product (GDP) growth rates. The state of the economy was 
defined as an expansionary period when GDP growth was greater than 3.5%, a normal
economic period when GDP growth was between 0% and 3.5%, and a recession when there
were two consecutive quarters of negative growth rates. Based on these definitions, 
from 1972 to 2017 there were six recessions, five expansionary periods, and five normal 
periods. 


The average monthly correlation and correlation volatilities were then compared for 
each state of the economy. Correlation levels during a recession, normal period, and 
expansionary period were 37.0%, 33.0%, and 27.5%, respectively. Thus, as expected, 
correlations were highest during recessions when common stocks in equity markets 
tend to go down together. The low correlation levels during an expansionary period 
suggest common stock valuations are determined more on industry and company- 
specific information rather than macroeconomic factors. 


The correlation volatilities during a recession, normal period, and expansionary period 
were 80.5%, 83.0%, and 71.2%, respectively. These results may seem a little surprising 
at first as one may expect volatilities are highest during a recession. However, there is 
perhaps slightly more uncertainty in a normal economy regarding the overall direction 
of the stock market. In other words, investors expect stocks to go down during a 
recession and up during an expansionary period, but they are less certain of direction 
during normal times, which results in higher correlation volatility. 


PROFESSOR’S NOTE
The main lesson from this portion of the study is that risk managers should be
cognizant of high correlation and correlation volatility levels during 
recessions and times of extreme economic distress when calibrating risk 
management models. 


Mean Reversion and Autocorrelation 


LO 8.b: Calculate a mean reversion rate using standard regression and calculate 
the corresponding autocorrelation. 


Mean reversion implies that over time, variables or returns regress back to the mean 
or average return. Empirical studies reveal evidence that bond values, interest rates, 
credit spreads, stock returns, volatility, and other variables are mean reverting. For 
example, during a recession, demand for capital is low. Therefore, interest rates are 
lowered to encourage investment in the economy. Then, as the economy picks up, 
demand for capital increases and, at some point, interest rates will rise. If interest rates 
are too high, demand for capital decreases and interest rates decrease and approach the 
long-run average. The level of interest rates is also a function of monetary and fiscal 
policy and not just supply and demand levels of capital. 


Mean reversion is statistically defined as a negative relationship between the change in
a variable over time, $S_t - S_{t-1}$, and the variable in the previous period, $S_{t-1}$:

$$\frac{\partial(S_t - S_{t-1})}{\partial S_{t-1}} < 0$$

In this equation, $S_t$ is the value of the variable at time period t, $S_{t-1}$ is the value of the
variable in the previous period, and $\partial$ denotes a partial derivative. Mean reversion
exists when $S_{t-1}$ increases (decreases) by a small amount causing $S_t - S_{t-1}$ to decrease
(increase) by a small amount. For example, if $S_{t-1}$ increases and is high at time period
t - 1, then mean reversion causes the next value at $S_t$ to reverse and decrease toward the
long-run average or mean value. The mean reversion rate is the degree of the
attraction back to the mean and is also referred to as the speed or gravity of mean
reversion. The mean reversion rate, a, is expressed as follows:


$$S_t - S_{t-1} = a(\mu - S_{t-1})\Delta t + \sigma\varepsilon\sqrt{\Delta t}$$


If we are only concerned with measuring mean reversion, we can ignore the last term,
$\sigma\varepsilon\sqrt{\Delta t}$, which is the stochastic part of the equation requiring random samples from a
distribution over time. By ignoring the last term and assuming $\Delta t = 1$, the mean
reversion rate equation simplifies to:


$$S_t - S_{t-1} = a(\mu - S_{t-1})$$


EXAMPLE: Calculating mean reversion 


Suppose mean reversion exists for a variable with a value of 50 at time period t - 1.
The long-run mean value, $\mu$, is 80. What are the expected changes in value of the
variable over the next period, $S_t - S_{t-1}$, if the mean reversion rate, a, is 0, 0.5, or 1.0?


Answer: 


If the mean reversion rate is 0, there is no mean reversion and there is no expected 
change. If the mean reversion rate is 0.5, there is a 50% mean reversion and the 
expected change is 15 [i.e., 0.5 x (80 — 50)]. If the mean reversion rate is 1.0, there is 
100% mean reversion and the expected change is 30 [i.e., 1.0 x (80 - 50)]. Thus, a 
stronger or faster mean reversion is expected with a higher mean reversion rate. 
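
The three cases can be tabulated with a one-line function. This is a minimal sketch of the simplified mean reversion equation with the stochastic term ignored; the function name is illustrative.

```python
def expected_change(previous_value: float, long_run_mean: float,
                    reversion_rate: float) -> float:
    """S_t - S_{t-1} = a * (mu - S_{t-1}), ignoring the stochastic term."""
    return reversion_rate * (long_run_mean - previous_value)


if __name__ == "__main__":
    # Previous value of 50 and long-run mean of 80, as in the example.
    for a in (0.0, 0.5, 1.0):
        change = expected_change(50.0, 80.0, a)
        print(f"a = {a:.1f}: expected change = {change:.1f}")
```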


Standard regression analysis is one method used to estimate the mean reversion rate, a.
We can think of the mean reversion rate equation in terms of a standard regression
equation (i.e., $Y = \alpha + \beta X$) by applying the distributive property to reformulate the right
side of the equation:


$$S_t - S_{t-1} = a\mu - aS_{t-1}$$


Thinking of this equation in terms of a standard regression implies the following terms
in the regression equation:
$S_t - S_{t-1} = Y$; $a\mu = \alpha$; and $-aS_{t-1} = \beta X$


A regression is run where $S_t - S_{t-1}$ (i.e., the Y variable) is regressed with respect to $S_{t-1}$
(i.e., the X variable). Thus, the $\beta$ coefficient of the regression is equal to the negative of
the mean reversion rate, a.


From the 1972 to 2017 study, the data resulted in the following regression equation: 
Y = 0.256 - 0.7903X 


The beta coefficient of -0.7903 implies a mean reversion rate of 79.03%. This is a 
relatively high mean reversion rate. Thus, if there is a large decrease (increase) from the 
mean correlation for one month, the following month is expected to have a large 
increase (decrease) in correlation. 
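
To see how such a regression recovers the mean reversion rate, the sketch below simulates a mean-reverting series and regresses the period change on the lagged level using ordinary least squares. The series is synthetic and the parameter values (a = 0.79, mean of 0.32, noise of 0.05) are assumptions chosen only for illustration; they are not the data from the 1972 to 2017 study.

```python
import random


def ols(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_mean_reversion(series: list[float]) -> tuple[float, float]:
    """Regress S_t - S_{t-1} on S_{t-1}; return (a, mu) with a = -slope, mu = intercept / a."""
    x = series[:-1]
    y = [series[t] - series[t - 1] for t in range(1, len(series))]
    intercept, slope = ols(x, y)
    a = -slope
    return a, intercept / a  # assumes a != 0 (i.e., some mean reversion exists)


def simulate(a: float, mu: float, sigma: float, n: int,
             start: float, seed: int) -> list[float]:
    """Synthetic path: S_t = S_{t-1} + a * (mu - S_{t-1}) + sigma * epsilon."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(n):
        prev = path[-1]
        path.append(prev + a * (mu - prev) + sigma * rng.gauss(0.0, 1.0))
    return path


if __name__ == "__main__":
    path = simulate(a=0.79, mu=0.32, sigma=0.05, n=5000, start=0.32, seed=42)
    a_hat, mu_hat = estimate_mean_reversion(path)
    print(f"estimated a = {a_hat:.3f}, estimated long-run mean = {mu_hat:.3f}")
```

With noisy data, the estimates are close to, but not exactly equal to, the parameters used to generate the series.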


EXAMPLE: Calculating expected correlation 


Suppose that in October 2022, the average monthly correlation for all Dow stocks 
was 30% and the long-run correlation mean of Dow stocks was 35%. A risk 
manager runs a regression, and the regression output estimates the following 
regression relationship: Y = 0.273 - 0.78X. What is the expected correlation for 
November 2022 given the mean reversion rate estimated in the regression 
analysis? (Solve for S, in the mean reversion rate equation.) 


Answer: 


There is a 5% difference between the October 2022 correlation and the long-run mean
correlation (35% - 30% = 5%). The $\beta$ coefficient in the regression relationship implies a mean
reversion rate of 78%. The November 2022 correlation is expected to revert 78%
of the difference back toward the mean. Thus, the expected correlation for
November 2022 is 33.9%:

$$S_t = a(\mu - S_{t-1}) + S_{t-1}$$

$$S_t = 0.78(35\% - 30\%) + 0.3 = 0.339$$
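
The same answer follows directly from the regression output, because the intercept equals the mean reversion rate times the long-run mean. The sketch below makes that link explicit; the function name is a hypothetical helper, not something defined in the reading.

```python
def expected_next_value(intercept: float, slope: float, previous_value: float) -> float:
    """Apply S_t = a * (mu - S_{t-1}) + S_{t-1} using a = -slope and mu = intercept / a."""
    a = -slope            # mean reversion rate is the negative of the regression slope
    mu = intercept / a    # long-run mean implied by the regression intercept
    return a * (mu - previous_value) + previous_value


if __name__ == "__main__":
    # Regression Y = 0.273 - 0.78X and an October correlation of 30%.
    forecast = expected_next_value(0.273, -0.78, 0.30)
    print(f"Expected November correlation: {forecast:.1%}")
```

Note that the implied long-run mean, 0.273 / 0.78 = 0.35, matches the 35% stated in the example.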


Autocorrelation measures the degree that a current variable value is correlated to 
past values. Autocorrelation is often calculated using an autoregressive conditional 
heteroskedasticity (ARCH) model or a generalized autoregressive conditional 
heteroskedasticity (GARCH) model. An alternative approach to measuring 
autocorrelation is running a regression equation. In fact, autocorrelation has the exact 
opposite properties of mean reversion. 


Mean reversion measures the tendency to pull away from the current value back to the 
long-run mean. Autocorrelation instead measures the persistence to pull toward more 
recent historical values. The mean reversion rate in the previous example was 78% for 
Dow stocks. Thus, the autocorrelation for a one-period lag is 22% for the same sample. 
The sum of the mean reversion rate and the one-period autocorrelation rate will always 
equal one (i.e., 78% + 22% = 100%). 
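
A short derivation shows why the two rates sum to one. This is a sketch under two stated assumptions that are not spelled out in the reading: the series is stationary, so its standard deviation is the same at t and t - 1, and the noise term is uncorrelated with $S_{t-1}$.

```latex
\begin{align*}
S_t - S_{t-1} &= a(\mu - S_{t-1}) + \text{noise} \\
S_t &= a\mu + (1 - a)\,S_{t-1} + \text{noise} \\
AC(S_t, S_{t-1}) &= \frac{Cov(S_t, S_{t-1})}{\sigma(S_t)\,\sigma(S_{t-1})}
  = \frac{(1 - a)\,\sigma^2(S_{t-1})}{\sigma(S_t)\,\sigma(S_{t-1})}
  = 1 - a \quad \text{when } \sigma(S_t) = \sigma(S_{t-1}) \\
a = 0.78 &\;\Rightarrow\; AC = 1 - 0.78 = 0.22
\end{align*}
```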
Autocorrelation for a one-period lag is statistically defined as:

$$AC(\rho_t, \rho_{t-1}) = \frac{Cov(\rho_t, \rho_{t-1})}{\sigma(\rho_t) \times \sigma(\rho_{t-1})}$$


The term $AC(\rho_t, \rho_{t-1})$ represents the autocorrelation of the correlation from time
period t and the correlation from time period t - 1. For this example, the $\rho_t$ term can
represent the correlation matrix for Dow stocks on day t, and the $\rho_{t-1}$ term can
represent the correlation matrix for Dow stocks on day t - 1. The covariance between
the correlation measures, $Cov(\rho_t, \rho_{t-1})$, is calculated the same way covariance is
calculated for equity returns.


This autocorrelation equation was used to calculate the one-period lag autocorrelation 
of Dow stocks for the 1972 to 2017 time period, and the result was 20.97%, which is 
identical to subtracting the mean reversion rate from one. The study also used this 
equation to test autocorrelations for 1- to 10-month lag periods for Dow stocks. The 
highest autocorrelation of 26% was found using a two-month lag, which compares the 
time period t correlation with the t - 2 correlation (two months prior). The 
autocorrelation for longer lags decreased gradually to approximately 10% using a 10- 
month lag. It is common for autocorrelations to decay with longer time period lags. 
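
A lagged autocorrelation of this kind can be computed as the correlation coefficient between a series and a shifted copy of itself. The sketch below assumes a simple list of average correlations, one per period; the sample values are hypothetical and are not taken from the study.

```python
import math


def autocorrelation(series: list[float], lag: int = 1) -> float:
    """Correlation coefficient between the series at time t and at time t - lag."""
    if lag < 1 or len(series) - lag < 2:
        raise ValueError("lag must be >= 1 and leave at least two paired observations")
    current = series[lag:]    # values at time t
    lagged = series[:-lag]    # values at time t - lag
    n = len(current)
    mean_c = sum(current) / n
    mean_l = sum(lagged) / n
    cov = sum((c - mean_c) * (l - mean_l) for c, l in zip(current, lagged)) / n
    std_c = math.sqrt(sum((c - mean_c) ** 2 for c in current) / n)
    std_l = math.sqrt(sum((l - mean_l) ** 2 for l in lagged) / n)
    return cov / (std_c * std_l)


if __name__ == "__main__":
    # Hypothetical monthly average correlations, for illustration only.
    monthly_corr = [0.31, 0.36, 0.29, 0.40, 0.33, 0.27, 0.35, 0.38, 0.30, 0.34, 0.32, 0.37]
    for lag in (1, 2, 3):
        print(f"lag {lag}: autocorrelation = {autocorrelation(monthly_corr, lag):.3f}")
```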


PROFESSOR’S NOTE
The autocorrelation equation is exactly the same as the correlation
coefficient. Correlation values for time period t and t - 1 are used to 
determine the autocorrelation between the two correlations. 


Best-Fit Distributions for Correlations 


LO 8.c: Identify the best-fit distribution for equity, bond, and default 
correlations. 


Seventy-seven percent of the correlations between stocks listed on the Dow from 1972 
to 2017 were positive. Three distribution fitting tests were used to determine the best 
fit for equity correlations. Based on the results of the Kolmogorov-Smirnov, Anderson- 
Darling, and chi-squared distribution fitting tests, the Johnson SB distribution (which 
has two shape parameters, one location parameter, and one scale parameter) provided 
the best fit for equity correlations. The Johnson SB distribution best fit was also robust 
with respect to testing different economic states for the time period in question. The 
normal, lognormal, and beta distributions provided a poor fit for equity correlations. 


There were three mild recessions and three severe recessions from 1972 to 2017. The 
time periods for the mild recessions occurred in 1980, 1990 to 1991, and 2001. More 
severe recessions occurred from 1973 to 1974 and from 1981 to 1982. Both of these 
severe recessions were caused by huge increases in oil prices. The most severe 
recession for this time period occurred from 2007 to 2009. The percentage change in 
correlation volatility prior to a recession was negative in every case except for the 
1990 to 1991 recession. This is consistent with the findings discussed earlier where 
correlation volatility is low during expansionary periods that often occur prior to a 
recession. 


An empirical investigation of 7,645 bond correlations found average correlations for
bonds of 42%. Correlation volatility for bond correlations was 64%. Bond correlations
were also found to exhibit properties of mean reversion, but the mean reversion rate
was only 26%. The best-fit distribution for bond correlations was found to be the
generalized extreme value (GEV) distribution. However, the normal distribution is
also a good fit for bond correlations.


A study of 4,655 default probability correlations revealed an average default 
correlation of 30%. Correlation volatility for default probability correlations was 88%. 
The mean reversion rate for default probability correlations was 30%, which is closer 
to the 26% for bond correlations. However, the default probability correlation 
distribution was similar to equity distributions in that the Johnson SB distribution is 
the best fit for both distributions. Figure 8.1 summarizes the findings of the empirical 
correlation analysis. 


Figure 8.1: Empirical Findings for Equity, Bond, and Default Correlations 


Correlation Type       Average        Correlation    Mean Reversion    Best-Fit
                       Correlation    Volatility     Rate              Distribution
Equity                 35%            80%            78%               Johnson SB
Bond                   42%            64%            26%               Generalized extreme value
Default Probability    30%            88%            30%               Johnson SB


MODULE QUIZ 8.1


1. Suppose a risk manager examines the correlations and correlation volatility of stocks 
in the Dow Jones Industrial Average (Dow) for the period beginning in 1972 and 
ending in 2017. Expansionary periods are defined as periods where the U.S. gross 
domestic product (GDP) growth rate is greater than 3.5%, periods are normal when 
the GDP growth rates are between 0 and 3.5%, and recessions are periods with two 
consecutive negative GDP growth rates. Which of the following statements 
characterizes correlation and correlation volatilities for this sample? The risk 
manager will most likely find that: 

A. correlations and correlation volatility are highest for recessions. 

B. correlations and correlation volatility are highest for expansionary periods. 

C. correlations are highest for normal periods, and correlation volatility is highest for 
recessions. 

D. correlations are highest for recessions, and correlation volatility is highest for 
normal periods. 


2. Suppose mean reversion exists for a variable with a value of 30 at time period t - 1.
Assume that the long-run mean value for this variable is 40 and ignore the stochastic
term included in most regressions of financial data. What is the expected change in
value of the variable for the next period if the mean reversion rate is 0.4?

A. -10.
B. -4.
C. 4.
D. 10.


3. A risk manager uses the past 480 months of correlation data from the Dow Jones
Industrial Average (Dow) to estimate the long-run mean correlation of common stocks
and the mean reversion rate. Based on historical data, the long-run mean correlation
of Dow stocks was 32%, and the regression output estimates the following regression
relationship: Y = 0.24 - 0.75X. Suppose that in April 2014, the average monthly
correlation for all Dow stocks was 36%. What is the expected correlation for May 2014
assuming the mean reversion rate estimated in the regression analysis?

A. 32%.
B. 33%.
C. 35%.
D. 37%.


4. A risk manager uses the past 480 months of correlation data from the Dow Jones 
Industrial Average (Dow) to estimate the long-run mean correlation of common stocks 
and the mean reversion rate. Based on this historical data, the long-run mean 
correlation of Dow stocks was 34%, and the regression output estimates the following 
regression relationship: Y = 0.262 — 0.77X. Suppose that in April 2014, the average 
monthly correlation for all Dow stocks was 33%. What is the estimated one-period 
autocorrelation for this time period based on the mean reversion rate estimated in the 
regression analysis? 

A. 23%. 
B. 26%. 
C. 30%. 
D. 33%. 


5. In estimating correlation matrices, risk managers often assume an underlying 
distribution for the correlations. Which of the following statements most accurately 
describes the best-fit distributions for equity correlation distributions, bond 
correlation distributions, and default probability correlation distributions? The best-fit 
distribution for the equity, bond, and default probability correlation distributions, 
respectively are: 

A. lognormal, generalized extreme value, and normal. 

B. Johnson SB, generalized extreme value, and Johnson SB. 
C. beta, normal, and beta. 

D. Johnson SB, normal, and beta. 


KEY CONCEPTS 


LO 8.a 


Risk managers should be cognizant that historical correlation levels for common 
stocks in the Dow are highest during recessions. Correlation volatility for Dow stocks is 
high during recessions but highest during normal economic periods. 


LO 8.b 
When a regression is run where $S_t - S_{t-1}$ (the Y variable) is regressed with respect to
$S_{t-1}$ (the X variable), the $\beta$ coefficient of the regression is equal to the negative mean
reversion rate, $a$.


Equity correlations show high mean reversion rates (79%) and low autocorrelations
(21%). These two rates must sum to 100%. Bond correlations and default probability
correlations show much lower mean reversion rates and higher autocorrelation rates.


LO 8.c 


Equity correlation distributions and default probability correlation distributions are
best fit with the Johnson SB distribution. Bond correlation distributions are best fit
with the generalized extreme value distribution, but the normal distribution is also a
good fit.


ANSWER KEY FOR MODULE QUIZ 


Module Quiz 8.1 


1. D Findings of an empirical study of monthly correlations of Dow stocks from 1972
to 2017 revealed the highest correlation levels for recessions and the highest
correlation volatilities for normal periods. The correlation volatilities during a
recession and normal period were 80.5% and 83.0%, respectively. (LO 8.a)


2. C The mean reversion rate, $a$, indicates the speed of the change or reversion back to
the mean. If the mean reversion rate is 0.4 and the difference between the last
variable and long-run mean is 10 (= 40 - 30), the expected change for the next
period is 4 (i.e., 0.4 × 10 = 4). (LO 8.b)


3. B There is a -4% difference between the long-run mean correlation and the April 2014
correlation (32% - 36% = -4%). The negative of the $\beta$ coefficient in the regression
relationship implies a mean reversion rate of 75%. Thus, the expected correlation
for May 2014 is 33.0%:

$$S_t = a(\mu - S_{t-1}) + S_{t-1}$$

$$S_t = 0.75(0.32 - 0.36) + 0.36 = 0.33$$

(LO 8.b)


4. A The autocorrelation for a one-period lag is 23% for the same sample. The sum of
the mean reversion rate (77% given the beta coefficient of -0.77) and the one-
period autocorrelation rate will always equal 100%. (LO 8.b)


5. B Equity correlation distributions and default probability correlation distributions
are best fit with the Johnson SB distribution. Bond correlation distributions are
best fit with the generalized extreme value distribution. (LO 8.c)
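
The arithmetic behind answers 2 through 4 can be checked with a short sketch. The function names are illustrative assumptions; the relationships used (regression slope equals the negative mean reversion rate, intercept equals the rate times the long-run mean, and autocorrelation equals one minus the rate) are the ones stated in this reading.

```python
def expected_change(a, long_run_mean, previous_value):
    """Expected change S_t - S_{t-1} = a * (mu - S_{t-1}), ignoring the stochastic term."""
    return a * (long_run_mean - previous_value)

def mean_reversion_from_regression(intercept, slope):
    """For a regression of (S_t - S_{t-1}) on S_{t-1}: slope = -a, intercept = a * mu."""
    a = -slope
    mu = intercept / a
    return a, mu

# Question 2: value 30, long-run mean 40, mean reversion rate 0.4.
print(expected_change(0.4, 40, 30))

# Question 3: Y = 0.24 - 0.75X, April correlation of 36%.
a3, mu3 = mean_reversion_from_regression(0.24, -0.75)
may_correlation = 0.36 + expected_change(a3, mu3, 0.36)
print(round(may_correlation, 4))

# Question 4: Y = 0.262 - 0.77X, so autocorrelation = 1 - mean reversion rate.
a4, _ = mean_reversion_from_regression(0.262, -0.77)
print(round(1 - a4, 2))
```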


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Meissner, Chapter 5. 


READING 9 


FINANCIAL CORRELATION 
MODELING—BOTTOM-UP 
APPROACHES 


Study Session 2 


EXAM FOCUS 


A copula is a joint multivariate distribution that describes how variables from marginal 
distributions come together. Copulas provide an alternative measure of dependence 
between random variables that is not subject to the same limitations as correlation in 
applications such as risk measurement. For the exam, understand how a copula 
correlation is created by mapping two or more unknown distributions to a known 
distribution that has well-defined properties. Also, know how the Gaussian copula is 
used to estimate joint probabilities of default for specific time periods and the default 
time for multiple assets. The material in this reading is relatively complex, so your 
focus here should be on gaining a general understanding of how a copula function is 
applied. 


MODULE 9.1: FINANCIAL CORRELATION 
MODELING 


Copula Functions 


LO 9.a: Explain the purpose of copula functions and how they are applied in 
finance. 


A copula correlation is created by converting two or more unknown distributions 
that may have unique shapes and mapping them to a known distribution with well- 
defined properties, such as the normal distribution. A copula creates a joint probability 
distribution between two or more variables while maintaining their individual 
marginal distributions. This is accomplished by mapping multiple distributions to a 
single multivariate distribution. For example, the following expression defines a copula
function, C, that transforms an n-dimensional function on the interval [0,1] to a one-
dimensional function.


$$C: [0,1]^n \to [0,1]$$


Suppose $G_i(u_i) \in [0,1]$ is a univariate, uniform distribution with $u_i = u_1, \ldots, u_n$ and $i \in N$
(i.e., $i$ is an element of set $N$). A copula function, C, can then be defined as follows:


$$C[G_1(u_1), \ldots, G_n(u_n)] = F_n[F_1^{-1}(G_1(u_1)), \ldots, F_n^{-1}(G_n(u_n)); \rho_F]$$


In this equation, $G_i(u_i)$ are the marginal distributions, $F_n$ is the joint cumulative
distribution function, $F_i^{-1}$ is the inverse function of $F_i$, and $\rho_F$ is the correlation matrix
structure of the joint cumulative function $F_n$.


This copula function is translated as follows. Suppose there are n marginal
distributions, $G_1(u_1)$ to $G_n(u_n)$. A copula function exists that maps the marginal
distributions of $G_1(u_1)$ to $G_n(u_n)$ via $F_i^{-1}(G_i(u_i))$ and allows for the joining of the separate
values $F_i^{-1}(G_i(u_i))$ to a single n-variate function $F_n[F_1^{-1}(G_1(u_1)), \ldots, F_n^{-1}(G_n(u_n))]$ that has a
correlation matrix of $\rho_F$. Thus, this equation defines the process where unknown
marginal distributions are mapped to a well-known distribution, such as the standard
multivariate normal distribution.


Copulas gained popularity in the financial industry around the year 2000 because they 
aimed to use simple methods to solve complex problems. For example, a copula was 
assumed to only need a single, multidimensional, function to correlate assets for a 
collateralized debt obligation (CDO) structure with 100+ assets. However, flexible 
copula functions fell out of favor when the 2007-2009 financial crisis began. 


Gaussian Copula 


LO 9.b: Describe the Gaussian copula and explain how to use it to derive the joint 
probability of default of two assets. 


A Gaussian copula maps the marginal distribution of each variable to the standard 
normal distribution which, by definition, has a mean of zero and a standard deviation of 
one. The key property of a copula correlation model is preserving the original marginal 
distributions while defining a correlation between them. The mapping of each variable 
to the new distribution is done on a percentile-to-percentile basis.


Figure 9.1 illustrates that the variables of two unknown distributions X and Y have 
unique marginal distributions. The observations of the unknown distributions are 
mapped to the standard normal distribution on a percentile-to-percentile basis to 
create a Gaussian copula. 


Figure 9.1: Mapping a Gaussian Copula to the Standard Normal Distribution 


[Figure panels: Distribution of X; Distribution of Y]


For example, the 5th percentile observation for marginal distribution X is mapped to
the 5th percentile point on the univariate standard normal distribution. When the 5th 
percentile is mapped, it will have a value of -1.645. This is repeated for each 
observation on a percentile-to-percentile basis. Likewise, every observation on the 
marginal distribution of Y is mapped to the corresponding percentile on the univariate 
standard normal distribution. The new joint distribution is now a multivariate standard 
normal distribution. 
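
The percentile-to-percentile mapping described above can be sketched in a few lines. This is an illustration only: the function name and the use of the (rank - 0.5)/n plotting position to assign percentiles to sample observations are assumptions, not part of the reading.

```python
from statistics import NormalDist

def map_to_standard_normal(observations):
    """Map each observation to the standard normal value at the same percentile."""
    n = len(observations)
    order = sorted(range(n), key=lambda i: observations[i])
    mapped = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # Assumed plotting position; keeps every percentile strictly inside (0, 1).
        percentile = (rank - 0.5) / n
        mapped[idx] = NormalDist().inv_cdf(percentile)
    return mapped

# The 5th percentile of the standard normal distribution.
print(round(NormalDist().inv_cdf(0.05), 3))  # -1.645

# Hypothetical, non-normal observations (illustrative only).
print(map_to_standard_normal([10, 500, 20]))
```

The ordering of the observations is preserved; only the scale changes to that of the standard normal distribution.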


Now a correlation structure can be defined between the two variables X and Y. The 
unique marginal distributions of X and Y are not well-behaved structures, and therefore, 
it is difficult to define a relationship between the two variables. However, the standard 
normal distribution is a well-behaved distribution. Therefore, a copula is a way to 
indirectly define a correlation relationship between two variables when it is not 
possible to directly define a correlation. 

A Gaussian copula, $C_G$, is defined in the following expression for an n-variate example.
The joint standard multivariate normal distribution is denoted as $M_n$. The inverse of the
univariate standard normal distribution is denoted as $N_1^{-1}$. The notation $\rho_M$ denotes
the n × n correlation matrix for the joint standard multivariate normal distribution $M_n$.


$$C_G[G_1(u_1), \ldots, G_n(u_n)] = M_n[N_1^{-1}(G_1(u_1)), \ldots, N_n^{-1}(G_n(u_n)); \rho_M]$$


In finance, the Gaussian copula is a common approach for measuring default risk. The
approach can be transformed to define the Gaussian default time copula, $C_{GD}$, in the
following expression:

$$C_{GD}[Q_1(t), \ldots, Q_n(t)] = M_n[N_1^{-1}(Q_1(t)), \ldots, N_n^{-1}(Q_n(t)); \rho_M]$$


Marginal distributions of cumulative default probabilities, $Q_i(t)$, for assets i = 1 to n for
fixed time periods t are mapped to the single n-variate standard normal distribution $M_n$
with a correlation structure of $\rho_M$. The term $N_i^{-1}(Q_i(t))$ maps each individual
cumulative default probability for asset i for time period t on a percentile-to-percentile
basis to the standard normal distribution.


EXAMPLE: Applying a Gaussian copula 


Suppose a risk manager owns two non-investment grade assets. Figure 9.2 lists the 
default probabilities for the next five years for Companies B and C that have B and 
C credit ratings, respectively. How can a Gaussian copula be constructed to 
estimate the joint default probability, Q, of these two companies in the next year, 
assuming a one-year Gaussian default correlation of 0.4? 


Figure 9.2: Default Probabilities of Companies B and C 


Time, t    B Default Probability    C Default Probability
1          0.065                    0.238
2          0.081                    0.152
3          0.072                    0.113
4          0.064                    0.092
5          0.059                    0.072


PROFESSOR’S NOTE
Non-investment grade companies have a higher probability of default 
in the near term during the company crisis state. If the company 
survives past the near-term crisis, the probability of default will go 
down over time. 


Answer: 


In this example, there are only two companies, B and C. Thus, a bivariate standard
normal distribution, $M_2$, with a default correlation coefficient of $\rho$ can be applied.
With two companies, only a single correlation coefficient is required, and not a
correlation matrix of $\rho_M$.


$$C_{GD}[Q_B(t), Q_C(t)] = M_2[N^{-1}(Q_B(t)), N^{-1}(Q_C(t)); \rho]$$


Figure 9.3 illustrates the percentile-to-percentile mapping of cumulative default 
probabilities for each company to the standard normal distribution. 


Figure 9.3: Mapping Cumulative Default Probabilities to Standard Normal Distribution


Time, t    B Default      $Q_B(t)$    $N^{-1}(Q_B(t))$    C Default      $Q_C(t)$    $N^{-1}(Q_C(t))$
           Probability                                    Probability
1          0.065          0.065       -1.513              0.238          0.238       -0.712
2          0.081          0.146       -1.053              0.152          0.390       -0.279
3          0.072          0.218       -0.779              0.113          0.503       0.008
4          0.064          0.282       -0.577              0.092          0.595       0.241
5          0.059          0.341       -0.409              0.072          0.667       0.432


The third and sixth columns represent the cumulative default probabilities $Q_B(t)$
and $Q_C(t)$ for Companies B and C, respectively. The values in the fourth and seventh
columns map the respective cumulative default probabilities, $Q_B(t)$ and $Q_C(t)$, to
the standard normal distribution via $N^{-1}(Q(t))$. The values for the standard normal
distribution are determined using the Microsoft Excel® function
=NORMSINV(Q(t)) or the MATLAB® function =NORMINV(Q(t)). This process was
illustrated graphically in Figure 9.1.
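
The columns of Figure 9.3 can also be reproduced outside of Excel or MATLAB. The following sketch uses Python's standard library as a stand-in for NORMSINV; the function name is an assumption, and small differences in the third decimal place relative to the figure are possible because of rounding.

```python
from itertools import accumulate
from statistics import NormalDist

# Annual default probabilities from Figure 9.2.
annual_pd_b = [0.065, 0.081, 0.072, 0.064, 0.059]
annual_pd_c = [0.238, 0.152, 0.113, 0.092, 0.072]

def map_cumulative_pd(annual_pd):
    """Return (Q(t), N^-1(Q(t))) pairs, with Q(t) as the running sum of annual PDs."""
    cumulative = list(accumulate(annual_pd))
    return [(q, NormalDist().inv_cdf(q)) for q in cumulative]

rows_b = map_cumulative_pd(annual_pd_b)
rows_c = map_cumulative_pd(annual_pd_c)
for year, ((q_b, z_b), (q_c, z_c)) in enumerate(zip(rows_b, rows_c), start=1):
    print(f"{year}  {q_b:.3f}  {z_b:+.3f}  {q_c:.3f}  {z_c:+.3f}")
```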


The joint probability of both Company B and Company C defaulting within one
year is calculated as:

$$Q(t_B \le 1 \cap t_C \le 1) = M_2(X_B \le -1.513 \cap X_C \le -0.712, \rho = 0.4) = 3.4\%$$


PROFESSOR’S NOTE
You will not be asked to calculate the percentiles for mapping to the standard
normal distribution because it requires the use of Microsoft Excel® or
MATLAB®. In addition, you will not be asked to calculate the joint probability
of default for a bivariate normal distribution due to its complexity.
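
For readers who want to verify the 3.4% joint default probability in the example, the bivariate standard normal probability can be approximated numerically. The sketch below is one possible approach, not the method used in the reading: it integrates the standard normal density times a conditional normal probability with Simpson's rule. The function name, the lower integration bound of -8, and the step count are assumptions.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def bivariate_normal_cdf(a, b, rho, lower=-8.0, steps=2000):
    """Approximate P(X <= a, Y <= b) for a standard bivariate normal with correlation rho.

    Uses P = integral over x <= a of pdf(x) * cdf((b - rho * x) / sqrt(1 - rho^2)),
    evaluated with Simpson's rule on [lower, a]. Assumes a > lower and |rho| < 1.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    scale = sqrt(1.0 - rho * rho)

    def integrand(x):
        return STD_NORMAL.pdf(x) * STD_NORMAL.cdf((b - rho * x) / scale)

    h = (a - lower) / steps
    total = integrand(lower) + integrand(a)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lower + k * h)
    return total * h / 3.0

# Thresholds from Figure 9.3 and the assumed one-year default correlation of 0.4.
joint_pd = bivariate_normal_cdf(-1.513, -0.712, 0.4)
print(f"{joint_pd:.3%}")
```

With a correlation of zero the same function returns the product of the two individual default probabilities, which is a useful sanity check.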


Correlated Default Time 


LO 9.c: Summarize the process of finding the default time of an asset correlated 
to all other assets in a portfolio using the Gaussian copula. 


When a Gaussian copula is used to derive the default time relationship for more than
two assets, a Cholesky decomposition is used to derive a sample $M_n(\cdot)$ from a
multivariate copula $M_n(\cdot) \in [0,1]$. The default correlations of the sample are
determined by the default correlation matrix $\rho_M$ for the n-variate standard normal
distribution, $M_n$.


The first step is to equate the sample $M_n(\cdot)$ to the cumulative individual default
probability, $Q_i$, for asset i at time $\tau_i$ using the following equation. This is accomplished
using Microsoft Excel® or a Newton-Raphson search procedure.

$$M_n(\cdot) = Q_i(\tau_i)$$


Next, the random samples are repeatedly drawn from the n-variate standard normal
distribution $M_n(\cdot)$ to determine the expected default time using the Gaussian copula.
Random samples are drawn to estimate the default times, because there is no closed-
form solution for this equation.
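
The simulation steps above can be sketched as follows. This is a simplified illustration under stated assumptions: it reuses the cumulative default probabilities of Companies B and C from Figure 9.3 as the curves $Q_i(t)$, assumes $Q_i(0) = 0$, and replaces the Newton-Raphson search with linear interpolation between yearly points. All function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def cholesky(matrix):
    """Lower-triangular L with L * L^T = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower

def default_time(sample, cumulative_pd):
    """Solve sample = Q_i(tau) by linear interpolation (assumed in place of Newton-Raphson).

    cumulative_pd[k] is Q_i at year k + 1 and must be strictly increasing.
    Returns None when the sample exceeds the last point (no default within the horizon).
    """
    previous_q, previous_t = 0.0, 0.0
    for year, q in enumerate(cumulative_pd, start=1):
        if sample <= q:
            return previous_t + (sample - previous_q) / (q - previous_q)
        previous_q, previous_t = q, float(year)
    return None

def simulate_default_times(corr, curves, n_draws, seed=0):
    """Draw correlated standard normals, map them to [0, 1], and invert each Q_i curve."""
    rng = random.Random(seed)
    lower = cholesky(corr)
    n = len(curves)
    draws = []
    for _ in range(n_draws):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]                       # independent normals
        x = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(n)]  # correlated normals
        u = [NormalDist().cdf(v) for v in x]                              # samples in [0, 1]
        draws.append([default_time(u[i], curves[i]) for i in range(n)])
    return draws

curve_b = [0.065, 0.146, 0.218, 0.282, 0.341]
curve_c = [0.238, 0.390, 0.503, 0.595, 0.667]
corr = [[1.0, 0.4], [0.4, 1.0]]
draws = simulate_default_times(corr, [curve_b, curve_c], n_draws=1000, seed=42)
defaults_b = [t_b for t_b, _ in draws if t_b is not None]
print(len(defaults_b), "of", len(draws), "draws produce a default of B within five years")
```

Averaging the simulated times (for the draws that produce a default) gives a rough estimate of the expected default time within the horizon covered by the curve.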


EXAMPLE: Estimating default time 


Illustrate how a risk manager estimates the expected default time of asset i using 
an n-variate Gaussian copula. 


Answer: 


Suppose a risk manager draws a 25% cumulative default probability for asset i
from a random n-variate standard normal distribution, $M_n(\cdot)$. The n-variate
standard normal distribution includes a default correlation matrix, $\rho_M$, that has the
default correlations of asset i with all n assets. Figure 9.4 illustrates how to equate
this 25% with the market determined cumulative individual default probability
$Q_i(\tau_i)$. Suppose the first random sample equates to a default time $\tau_i$ of 3.5 years. This
process is then repeated 100,000 times to estimate the default time of asset i.


Figure 9.4: Mapping Default Time for a Random Sample 


MODULE QUIZ 9.1

1. Suppose a risk manager creates a copula function, C, defined by the equation:

$$C[G_1(u_1), \ldots, G_n(u_n)] = F_n[F_1^{-1}(G_1(u_1)), \ldots, F_n^{-1}(G_n(u_n)); \rho_F]$$

Which of the following statements does not accurately describe this copula function?
A. $G_i(u_i)$ are standard normal univariate distributions.
B. $F_n$ is the joint cumulative distribution function.
C. $F_i^{-1}$ is the inverse function of $F_i$ that is used in the mapping process.
D. $\rho_F$ is the correlation matrix structure of the joint cumulative function $F_n$.


2. Which of the following statements best describes a Gaussian copula? 
A. A major disadvantage of a Gaussian copula model is the transformation of the 
original marginal distributions in order to define the correlation matrix. 


B. The mapping of each variable to the new distribution is done by defining a 
mathematical relationship between marginal and unknown distributions. 

C. A Gaussian copula maps the marginal distribution of each variable to the standard 
normal distribution. 

D. A Gaussian copula is seldom used in financial models because ordinal numbers are 
required. 


3. A Gaussian copula is constructed to estimate the joint default probability of two assets 
within a one-year time period. Which of the following statements regarding this type 
of copula is incorrect? 


A. This copula requires that the respective cumulative default probabilities are 
mapped to a bivariate standard normal distribution. 

B. This copula defines the relationship between the variables using a default 
correlation matrix, $\rho_M$.

C. The term $N_i^{-1}(Q_i(t))$ maps each individual cumulative default probability for asset
i for time period t on a percentile-to-percentile basis.
D. This copula is a common approach used in finance to estimate joint default 
probabilities. 


4. A risk manager is trying to estimate the default time for asset i based on the default 
copula correlation of asset i to n assets. Which of the following equations best defines 
the process that the risk manager should use to generate and map random samples to 
estimate the default time? 


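A. $C_{GD}[Q_B(t), Q_C(t)] = M_2[N^{-1}(Q_B(t)), N^{-1}(Q_C(t)); \rho]$.
B. $C[G_1(u_1), \ldots, G_n(u_n)] = F_n[F_1^{-1}(G_1(u_1)), \ldots, F_n^{-1}(G_n(u_n)); \rho_F]$.
C. $C_{GD}[Q_1(t), \ldots, Q_n(t)] = M_n[N_1^{-1}(Q_1(t)), \ldots, N_n^{-1}(Q_n(t)); \rho_M]$.
D. $M_n(\cdot) = Q_i(\tau_i)$.
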
5. Suppose a risk manager owns two non-investment grade assets and has determined 
their individual default probabilities for the next five years. Which of the following 
equations best defines how a Gaussian copula is constructed by the risk manager to 


estimate the joint probability of these two companies defaulting within the next year, 
assuming a Gaussian default correlation of 0.35? 


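A. $C_{GD}[Q_B(t), Q_C(t)] = M_2[N^{-1}(Q_B(t)), N^{-1}(Q_C(t)); \rho]$.
B. $C[G_1(u_1), \ldots, G_n(u_n)] = F_n[F_1^{-1}(G_1(u_1)), \ldots, F_n^{-1}(G_n(u_n)); \rho_F]$.
C. $C_{GD}[Q_1(t), \ldots, Q_n(t)] = M_n[N_1^{-1}(Q_1(t)), \ldots, N_n^{-1}(Q_n(t)); \rho_M]$.
D. $M_n(\cdot) = Q_i(\tau_i)$.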


KEY CONCEPTS 


LO 9.a 
The general equation for copula correlation, C, is defined as:


$$C[G_1(u_1), \ldots, G_n(u_n)] = F_n[F_1^{-1}(G_1(u_1)), \ldots, F_n^{-1}(G_n(u_n)); \rho_F]$$


The notation for this copula equation is translated as: $G_i(u_i)$ are marginal distributions,
$F_n$ is the joint cumulative distribution function, $F_i^{-1}$ is the inverse function of $F_i$, and $\rho_F$
is the correlation matrix structure of the joint cumulative function $F_n$.


LO 9.b 
The Gaussian default time copula is defined as:


$$C_{GD}[Q_1(t), \ldots, Q_n(t)] = M_n[N_1^{-1}(Q_1(t)), \ldots, N_n^{-1}(Q_n(t)); \rho_M]$$


Marginal distributions of cumulative default probabilities, $Q_i(t)$, for assets i = 1 to n for
fixed time periods t are mapped to the single n-variate standard normal distribution,
$M_n$, with a correlation structure of $\rho_M$.


The Gaussian copula for the bivariate standard normal distribution, $M_2$, for two assets
with a default correlation coefficient of $\rho$ is defined as:


$$C_{GD}[Q_B(t), Q_C(t)] = M_2[N^{-1}(Q_B(t)), N^{-1}(Q_C(t)); \rho]$$


LO 9.c 


Random samples are drawn from an n-variate standard normal distribution sample,
$M_n(\cdot)$, to estimate expected default times using the Gaussian copula:


$$M_n(\cdot) = Q_i(\tau_i)$$


ANSWER KEY FOR MODULE QUIZ 


Module Quiz 9.1 


1. A $G_i(u_i)$ are marginal distributions that do not have well-known distribution
properties. (LO 9.a)


2. C Observations of the unknown marginal distributions are mapped to the standard
normal distribution on a percentile-to-percentile basis to create a Gaussian
copula. (LO 9.b)


3. B Because there are only two companies, only a single correlation coefficient is
required and not a correlation matrix, $\rho_M$. (LO 9.b)


4. D The equation $M_n(\cdot) = Q_i(\tau_i)$ is used to repeatedly generate random drawings from
the n-variate standard normal distribution to determine the expected default
time using the Gaussian copula. (LO 9.c)


5. A Because there are only two assets, the risk manager should use this equation to
define the bivariate standard normal distribution, $M_2$, with a single default
correlation coefficient of $\rho$. (LO 9.b)


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Tuckman and Serrat, Chapter 6. 


READING 10 


EMPIRICAL APPROACHES TO RISK 
METRICS AND HEDGING 


Study Session 3 


EXAM FOCUS 


This reading discusses how dollar value of a basis point (DV01)-style hedges can be 
improved. Regression-based hedges enhance DV01-style hedges by examining yield 
changes over time. Principal components analysis (PCA) greatly simplifies bond 
hedging techniques. For the exam, understand the drawbacks of a standard DV01-
neutral hedge, and know how to compute the face value of an offsetting position using
DV01 and how to adjust this position using regression-based hedging techniques. 


MODULE 10.1: EMPIRICAL APPROACHES TO RISK 
METRICS AND HEDGING 


DV01-Neutral Hedge 


LO 10.a: Explain the drawbacks to using a DV01-neutral hedge for a bond 
position. 


A standard DV01-neutral hedge assumes that the yield on a bond and the yield on a
hedging instrument rise and fall by the same number of basis points. However, a one-to-
one relationship does not always exist in practice. For example, if a trader hedges a T-
bond, which uses a nominal yield, with a Treasury security indexed to inflation (i.e.,
Treasury Inflation Protected Security [TIPS]), which uses a real yield, the hedge will
likely be imprecise when changes in yield occur. In general, more dispersion surrounds 
the change in the nominal yield for a given change in the real yield. Empirically, the 
nominal yield adjusts by more than one basis point for every basis point adjustment in 
the real yield. 


DV01-style metrics and hedges focus on how rates change relative to one another. As 
mentioned, the presumption that yields on nominal bonds and TIPS change by the same 
amount is not very realistic. To improve this DV01-neutral hedge approach, we can
apply regression analysis techniques. A regression hedge examines the volatility
of historical rate differences and adjusts the DV01 hedge accordingly.


Regression Hedge 


LO 10.b: Describe a regression hedge and explain how it can improve a standard 
DV01-neutral hedge. 


A regression hedge takes DV01-style hedges and adjusts them for projected nominal 
yield changes compared to projected real yield changes. Least squares regression 
analysis, which is used for regression-based hedges, looks at the historical relationship 
between real and nominal yields. 


The advantage of a regression framework is that it provides an estimate of a hedged 
portfolio’s volatility. An investor can gauge the expected gain in advance and compare it 
to historical volatility to determine whether the hedged portfolio is an attractive 
investment. 


For example, assume a relative value trade is established whereby a trader sells a U.S. 
Treasury bond and buys a U.S. TIPS (which makes inflation-adjusted payments) to 
hedge the T-bond. The initial spread between these two securities represents the 
current views on inflation. Over time, changes in yields on nominal bonds and TIPS do 
not track one-for-one. To illustrate this hedge, assume the following data for yields and 
DV01s of a TIPS and a T-bond. Also assume that the trader is selling 100 million of the 
T-bond. 


Bond      Yield (%)    DV01
TIPS      1.325        0.084
T-Bond    3.475        0.068


If the trade was made DV01-neutral, which assumes that the yield on the TIPS and the 
nominal bond will increase/decrease by the same number of basis points, the trade will 
not earn a profit or sustain a loss. The calculation for the amount of TIPS to purchase to 
hedge the short nominal bond is as follows: 


$$F^R \times \frac{0.084}{100} = \$100\text{M} \times \frac{0.068}{100}$$

$$F^R = \$100\text{M} \times \frac{0.068}{0.084} = \$80.95 \text{ million}$$

where:
$F^R$ = face amount of the real yield bond
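
The DV01-neutral calculation above amounts to scaling the face amount by the ratio of DV01s. A minimal sketch, with an illustrative function name and face amounts expressed in millions of dollars:

```python
def dv01_neutral_face(face_hedged, dv01_hedged, dv01_hedge_instrument):
    """Face amount of the hedging bond whose DV01 exposure offsets the hedged position."""
    return face_hedged * dv01_hedged / dv01_hedge_instrument

# Short $100 million of the T-bond (DV01 0.068), hedged with TIPS (DV01 0.084).
tips_face = dv01_neutral_face(100.0, 0.068, 0.084)
print(round(tips_face, 2))  # 80.95
```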


To improve this hedge, the trader gathers yield data over time and plots a regression 
line, whereby the real yield is the independent variable and the nominal yield is the 
dependent variable. To compensate for the dispersion in the change in the nominal 
yield for a given change in the real yield, the trader would adjust the DV01-neutral 
hedge. 


Hedge Adjustment Factor 


LO 10.c: Calculate the regression hedge adjustment factor, beta. 


In order to profit from a hedge, we must assume variability in the spread between the 
real and nominal yields over time. As mentioned, least squares regression is conducted 
to analyze these changes. The alpha and beta coefficients of a least squares regression 
line will be determined by the line of best fit through historical yield data points. 

$$\Delta y_t^N = \alpha + \beta \Delta y_t^R + \varepsilon_t$$

where:

$\Delta y_t^N$ = changes in the nominal yield

$\Delta y_t^R$ = changes in the real yield


Recall that alpha represents the intercept term and beta represents the slope of the 
data plot. If least squares estimation determines the yield beta to be 1.0198, then this 
means that over the sample period, the nominal yield increases by 1.0198 basis points 
for every basis point increase in real yields. 
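
As a rough sketch of how the alpha and beta coefficients are obtained, the least squares slope is the covariance of the two series of yield changes divided by the variance of the real yield changes. The function name and the yield changes below are hypothetical; they are constructed so that the slope equals the 1.0198 used in this reading.

```python
def ols_alpha_beta(x, y):
    """Least squares intercept and slope for y = alpha + beta * x + error."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical daily changes in the real yield (basis points), illustrative only.
real_changes = [1.0, -2.0, 3.0, 0.5, -1.5]
# Hypothetical nominal changes built with no noise so the fit is exact.
nominal_changes = [0.1 + 1.0198 * dx for dx in real_changes]

alpha, beta = ols_alpha_beta(real_changes, nominal_changes)
print(round(alpha, 4), round(beta, 4))
```

With real market data the points would scatter around the fitted line, and the standard error of the regression would measure that dispersion.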


Face Amount of Offsetting Position


LO 10.d: Calculate the face value of an offsetting position needed to carry out a 
regression hedge. 


Defining $F^R$ and $F^N$ as the face amounts of the real and nominal bonds, respectively, and
their corresponding DV01s as $DV01^R$ and $DV01^N$, a DV01 hedge is adjusted by the hedge
adjustment factor, or beta, as follows:


$$F^R = F^N \times \frac{DV01^N}{DV01^R} \times \beta$$


Now that we have determined the variability between the nominal and real yields, the 
hedge can be adjusted by the hedge adjustment factor of 1.0198: 


$$F^R = \$100\text{M} \times \frac{0.068}{0.084} \times 1.0198 = \$82.55 \text{ million}$$


This regression hedge approach suggests that for every $100 million sold in T-bonds, 
we should buy $82.55 million in TIPS. This will account for hedging not only the size of 
the underlying instrument, but also differences between nominal and real yields over 
time. 
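
The regression-adjusted face amount can be sketched the same way as the DV01-neutral hedge, with beta as an extra scaling factor. The function name is illustrative and amounts are in millions of dollars.

```python
def regression_hedge_face(face_nominal, dv01_nominal, dv01_real, beta):
    """Face amount of the real yield bond: F_R = F_N * (DV01_N / DV01_R) * beta."""
    return face_nominal * (dv01_nominal / dv01_real) * beta

adjusted_face = regression_hedge_face(100.0, 0.068, 0.084, 1.0198)
unadjusted_face = regression_hedge_face(100.0, 0.068, 0.084, 1.0)

# Unrounded, the adjusted amount is about 82.56; the $82.55 million in the text
# results from applying beta to the already rounded $80.95 million.
print(adjusted_face, unadjusted_face)
```

Setting beta to 1.0 recovers the DV01-neutral hedge, which makes the size of the regression adjustment easy to see.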


Note that in our example, the beta was close to one, so the resulting regression hedge
did not change much from the DV01-neutral hedge. The regression hedge approach
assumes that the hedge coefficient, $\beta$, is constant over time. This of course is not always
the case, so it is best to estimate the coefficient over different time periods and make
comparisons.


Two other factors should also be considered in our analysis: (1) the R-squared (i.e., the
coefficient of determination), and (2) the standard error of the regression (SER). The R-
squared gives the percentage of variation in nominal yields that is explained by real
yields. The standard error of the regression is the standard deviation of the realized
error terms in the regression.


Two-Variable Regression Hedge 


LO 10.e: Calculate the face value of multiple offsetting swap positions needed to 
carry out a two-variable regression hedge. 


Regression hedging can also be conducted with two independent variables. For
example, assume a trader in euro interest rate swaps buys/receives the fixed rate in a
relatively illiquid 20-year swap and wishes to hedge this interest rate exposure. In this
case, a regression hedge with swaps of different maturities would be appropriate. Since
it may be impractical to hedge this position by immediately selling 20-year swaps, the
trader may choose to sell a combination of 10- and 30-year swaps.


The trader is thus relying on a two-variable regression model to approximate the 
relationship between changes in 20-year swap rates and changes in 10- and 30-year 
swap rates. The following regression equation describes this relationship: 


$$\Delta y_t^{20} = \alpha + \beta^{10} \Delta y_t^{10} + \beta^{30} \Delta y_t^{30} + \varepsilon_t$$


Similar to the single-variable regression hedge, this hedge of the 20-year euro swap can 
be expressed in terms of risk weights, which are the beta coefficients in the equation 
just listed: 


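$$\frac{-F^{10} \times DV01^{10}}{F^{20} \times DV01^{20}} = \beta^{10}$$

$$\frac{-F^{30} \times DV01^{30}}{F^{20} \times DV01^{20}} = \beta^{30}$$

where $\beta^{10}$ and $\beta^{30}$ are the regression coefficients on the change in the 10-year swap
rate and the change in the 30-year swap rate, respectively.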


The trader next does an initial regression analysis using data on changes in the 10-, 20-, 
and 30-year euro swap rates for a five-year time period. Assume the regression output 
is as follows: 


Number of observations         1281
R-squared                      99.8%
Standard error                 0.14

Regression Coefficients        Value      Standard Error
Alpha                          -0.0014    0.0040
Change in 10-year swap rate    0.2221     0.0034
Change in 30-year swap rate    0.7765     0.0037


Given these regression results and an illiquid 20-year swap, the trader would hedge
22.21% of the 20-year swap DV01 with a 10-year swap and 77.65% of the 20-year swap
DV01 with a 30-year swap. Because these weights sum to approximately one, the
regression hedge DV01 will be very close to the 20-year swap DV01.
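
As a rough illustration of how the risk weights translate into face amounts, the sketch below solves each risk-weight equation for the face value of the hedging swap: F(hedge) = -beta x F(20) x DV01(20) / DV01(hedge). The betas come from the regression output above. The position size and the three DV01 figures are hypothetical values chosen only for illustration, and `hedge_face_value` is an assumed helper name.

```python
def hedge_face_value(face_20, dv01_20, beta, dv01_hedge):
    """Face value of one hedging swap; a negative result means sell (pay fixed)."""
    return -face_20 * dv01_20 * beta / dv01_hedge


def position_dv01(face, dv01_per_100):
    """Dollar DV01 of a position, with DV01 quoted per 100 of face value."""
    return face * dv01_per_100 / 100.0


# Regression betas from the output above.
beta_10, beta_30 = 0.2221, 0.7765

# Hypothetical inputs (NOT from the text): position size and DV01s per 100 face.
face_20 = 100_000_000
dv01_20, dv01_10, dv01_30 = 0.1500, 0.0850, 0.1950

face_10 = hedge_face_value(face_20, dv01_20, beta_10, dv01_10)
face_30 = hedge_face_value(face_20, dv01_20, beta_30, dv01_30)

# The hedge DV01 as a share of the 20-year DV01 equals the sum of the betas.
hedge_dv01 = -(position_dv01(face_10, dv01_10) + position_dv01(face_30, dv01_30))
coverage = hedge_dv01 / position_dv01(face_20, dv01_20)

print(f"10-year face: {face_10:,.0f}")
print(f"30-year face: {face_30:,.0f}")
print(f"Hedge DV01 / 20-year DV01: {coverage:.4f}")
```

Under these assumptions both face values come out negative (swaps sold against the long 20-year position), and the coverage ratio is simply the sum of the two betas, whatever DV01 values are plugged in.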


The two-variable approach will provide a better hedge (in terms of R-squared) 
compared to a single-variable approach. However, regression hedging is not an exact 
science. There are several cases in which simply doing a one-security DV01 hedge, or a 
two-variable hedge with arbitrary risk weights, is not appropriate (e.g., hedging during 
a financial crisis). 


Level and Change Regressions 


LO 10.f: Compare and contrast level and change regressions. 


When setting up and establishing regression-based hedges, there are two schools of 
thought. Some regress changes in yields on changes in yields, as demonstrated 
previously, but an alternative approach is to regress yields on yields. 


Using a single-variable approach, the formula for a change-on-change regression with 
dependent variable y and independent variable x is as follows: 


$$\Delta y_t = \alpha + \beta\,\Delta x_t + \Delta\varepsilon_t$$
where:
$$\Delta y_t = y_t - y_{t-1}$$
$$\Delta x_t = x_t - x_{t-1}$$


Alternatively, the formula for a level-on-level regression is as follows: 


$$y_t = \alpha + \beta x_t + \varepsilon_t$$


With both approaches, the estimated regression coefficients are unbiased and
consistent; however, the error terms are unlikely to be independent of each other. Thus,
since the error terms are correlated over time (i.e., serially correlated), the estimated
regression coefficients are not efficient. As a result, there is a third way to model the
relationship between two bond yields (for some constant correlation ρ < 1):

$$\varepsilon_t = \rho\,\varepsilon_{t-1} + \nu_t$$

This formula assumes that today’s error term consists of some part of yesterday’s error
term, plus a new random fluctuation.
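
To see how this third specification relates to the other two, the following short derivation may help. It is a sketch under stated assumptions: the persistence constant is written as ρ and the new random fluctuation as ν_t (assumed to be uncorrelated over time), and the level-on-level equation is lagged one period, multiplied by ρ, and subtracted from the current-period equation.

```latex
\begin{aligned}
y_t &= \alpha + \beta x_t + \varepsilon_t, \qquad
\varepsilon_t = \rho\,\varepsilon_{t-1} + \nu_t \\[4pt]
y_t - \rho\,y_{t-1}
  &= \alpha(1-\rho) + \beta\,(x_t - \rho\,x_{t-1}) + (\varepsilon_t - \rho\,\varepsilon_{t-1}) \\
  &= \alpha(1-\rho) + \beta\,(x_t - \rho\,x_{t-1}) + \nu_t \\[4pt]
\rho = 0:&\quad y_t = \alpha + \beta x_t + \nu_t
  \quad\text{(level-on-level with uncorrelated errors)} \\
\rho \to 1:&\quad \Delta y_t = \beta\,\Delta x_t + \nu_t
  \quad\text{(change-on-change with uncorrelated errors)}
\end{aligned}
```

Read this way, the level and change regressions sit at the two ends of the range of persistence, and an intermediate ρ describes errors that are only partly carried over from one period to the next.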


Principal Components Analysis 


LO 10.g: Describe principal component analysis and explain how it is applied to 
constructing a hedging portfolio. 


Regression analysis focuses on yield changes among a small number of bonds. 
Empirical approaches, such as principal components analysis (PCA), take a different 
approach by providing a single empirical description of term structure behavior, which 
can be applied across all bonds. PCA attempts to explain all factor exposures using a 
small number of uncorrelated exposures which do an adequate job of capturing risk. 


For example, if we consider the set of swap rates from 1 to 30 years, at annual
maturities, the PCA approach creates 30 interest rate factors or components, and each
factor describes a change in each of the 30 rates. This is in contrast to regression
analysis, which looks at variances of rates and their pairwise correlations.


PCA sets up the 30 factors with the following properties: 


1. The sum of the variances of the 30 principal components (PCs) equals the sum of the 
variances of the individual rates. The PCs thus capture the volatility of the set of 
rates. 


2. The PCs are not correlated with each other. 


3. Each PC is chosen to contain the highest possible variance, given the earlier PCs. 


The advantage of this approach is that we only really need to describe the volatility and 
structure of the first three PCs since the sum of the variances of the first three PCs is a
good approximation of the sum of the variances of all rates. Thus, the PCA approach 
creates three factors that capture similar data as a comprehensive matrix containing 
variances and covariances of all interest rate factors. Changes in 30 rates can now be 
expressed with changes in three factors, which is a much simpler approach. 
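
The three properties listed above can be checked numerically on a small case. The sketch below is illustrative only: it uses a hypothetical 3 x 3 covariance matrix of rate changes (three maturities rather than 30, values invented for the example) and extracts the components by power iteration with deflation, which is one simple way to obtain them and not necessarily how a production system would do it. All function names are assumptions.

```python
import math


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def largest_component(cov, iterations=1000):
    """Power iteration: (variance, unit loading vector) of the dominant PC."""
    vec = [1.0 / (i + 1) for i in range(len(cov))]  # arbitrary starting vector
    for _ in range(iterations):
        nxt = mat_vec(cov, vec)
        norm = math.sqrt(dot(nxt, nxt))
        vec = [x / norm for x in nxt]
    return dot(vec, mat_vec(cov, vec)), vec


def principal_components(cov):
    """All PCs, largest variance first, found by repeated deflation."""
    n = len(cov)
    work = [row[:] for row in cov]
    components = []
    for _ in range(n):
        variance, vec = largest_component(work)
        components.append((variance, vec))
        # Strip out the variance already explained by this component.
        work = [[work[i][j] - variance * vec[i] * vec[j] for j in range(n)]
                for i in range(n)]
    return components


# Hypothetical covariance matrix of changes in three rates (bp squared).
cov = [[25.0, 20.0, 15.0],
       [20.0, 36.0, 30.0],
       [15.0, 30.0, 49.0]]

pcs = principal_components(cov)
total_rate_variance = sum(cov[i][i] for i in range(len(cov)))
total_pc_variance = sum(variance for variance, _ in pcs)

print("Sum of rate variances:", total_rate_variance)
print("Sum of PC variances:  ", round(total_pc_variance, 6))
print("PC1 . PC2 (should be ~0):", round(dot(pcs[0][1], pcs[1][1]), 9))
print("Share explained by PC1:", round(pcs[0][0] / total_rate_variance, 3))
```

With only three rates, three components necessarily recover all of the variance; the point of the sketch is the mechanics (variances add up, components are uncorrelated, each successive component has the largest remaining variance), which carry over to the 30-rate case described in the text.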


MODULE QUIZ 10.1


1. If a trader is creating a fixed-income hedge, which hedging methodology would be 
least effective if the trader is concerned about the dispersion of the change in the 
nominal yield for a particular change in the real yield? 

A. One-variable regression hedge. 
B. DV01 hedge.

C. Two-variable regression hedge. 
D. Principal components hedge. 


2. Assume that a trader is making a relative value trade, selling a U.S. Treasury bond 
and correspondingly purchasing a U.S. TIPS. Based on the current spread between 
the two securities, the trader shorts $100 million of the nominal bond and purchases 
$89.8 million of TIPS. The trader then starts to question the amount of the hedge due 
to changes in yields on TIPS in relation to nominal bonds. He runs a regression and 
determines from the output that the nominal yield changes by 1.0274 basis points per 
basis point change in the real yield. Would the trader adjust the hedge, and if so, by 
how much? 

A. No. 

B. Yes, by $2.46 million (purchase additional TIPS). 
C. Yes, by $2.5 million (sell a portion of the TIPS). 
D. Yes, by $2.11 million (purchase additional TIPS). 


3. What is a key advantage of using a regression hedge to fine-tune a DV01 hedge?
A. It assumes that term structure changes are driven by one factor. 
B. The proper hedge amount may be computed for any assumed change in the term 
structure. 
C. Bond price changes and returns can be estimated with proper measures of price 
sensitivity. 
D. It gives an estimate of the hedged portfolio’s volatility over time. 


4. What does the regression hedge assume about the hedge coefficient, beta? 
A. It moves in lockstep with real rates. 
B. It stays constant over time. 
C. It generally tracks nominal rates over time. 


D. It is volatile over time, similar to both real and nominal rates. 


5. Traci York, FRM, is setting up a regression-based hedge and is trying to decide 
between a changes-in-yields-on-changes-in-yields approach versus a yields-on-yields 
approach. Which of the following is a correct statement concerning error terms in 
these two approaches? 


A. In both cases, the error terms are completely uncorrelated. 

B. With change-on-change, there is no correlation in error terms, while yield-on-yield 
error terms are completely correlated. 

C. Error terms are correlated over time with both approaches. 

D. With yield-on-yield, there is no correlation in error terms, while change-on-change 
error terms are completely correlated. 


KEY CONCEPTS 


LO 10.a 


A DV01-neutral hedge assumes the yield on a bond and the yield on a hedging
instrument rise and fall by the same number of basis points. However, empirically, a
nominal yield will adjust by more than one basis point for every basis point adjustment
in a real yield.

LO 10.b 

A regression hedge adjusts for the extra movement in the projected nominal yield 
changes compared to the projected real yield changes. 

LO 10.c 


Least squares regression is conducted to analyze the changes in historical yields 
between nominal and real bonds. 


$$\Delta y_t^N = \alpha + \beta\,\Delta y_t^R + \varepsilon_t$$


LO 10.d 
A DV01 hedge is adjusted by the hedge adjustment factor, or beta, as follows: 


$$F^R = F^N \times \frac{DV01^N}{DV01^R} \times \beta$$


LO 10.e 


Regression hedging can also be conducted with a two-variable regression model. The 
beta coefficients in the regression model represent risk weights, which are used to 
calculate the face value of multiple offsetting positions. 


LO 10.f 


The formula for a level-on-level regression with dependent variable y and independent 
variable x is as follows: 


$$y_t = \alpha + \beta x_t + \varepsilon_t$$


The formula for a change-on-change regression is as follows: 


$$y_t - y_{t-1} = \Delta y_t = \alpha + \beta\,\Delta x_t + \Delta\varepsilon_t$$


With both approaches, the error terms are unlikely to be independent of each other. 


LO 10.g 


Principal components analysis (PCA) provides a single empirical description of term 
structure behavior, which can be applied across all bonds. The advantage of this 
approach is that we only need to describe the volatility and structure of a small 
number of principal components, which approximate all movements in the term 
structure. 


ANSWER KEY FOR MODULE QUIZ 


Module Quiz 10.1 


1.B The DV01 hedge assumes that the yields on the bond and on the hedging
instrument rise and fall by the same number of basis points; so with a DV01
hedge, there is not much the trader can do to allow for dispersion between
nominal and real yields. (LO 10.a)


2.B The trader would need to adjust the hedge as follows: 
$89.8 million x 1.0274 = $92.26 million 
Thus, the trader needs to purchase additional TIPS worth $2.46 million. (LO 10.d) 


3.D A key advantage of using a regression approach in setting up a hedge is that it 
automatically gives an estimate of the hedged portfolio’s volatility. (LO 10.b) 


4.B It should be pointed out that while it is true that the regression hedge assumes a 
constant beta, this is not a realistic assumption; thus, it is best to estimate beta 
over several time periods and compare accordingly. (LO 10.d) 


5.C With the level-on-level approach, error terms are somewhat correlated over time,
while with the change-on-change approach, the error terms are completely
correlated. Thus, error terms are correlated over time with both approaches.
(LO 10.f)


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Tuckman and Serrat, Chapter 7. 


READING 11 


THE SCIENCE OF TERM 
STRUCTURE MODELS 


Study Session 3 


EXAM FOCUS 


The emphasis of this reading is the pricing of interest rate derivative contracts using a 
risk-neutral binomial model. The pricing process for interest rate derivatives requires 
intensive calculations and is very tedious. However, the relationship becomes 
straightforward when it is modeled to support risk neutrality. Understand the concepts 
of backward induction and how the addition of time steps will increase the accuracy of 
any bond pricing model. Bonds with embedded options are also discussed in this 
reading. Be familiar with the price-yield relationship of both callable and putable 
bonds. This reading incorporates elements of material from the FRM Part I curriculum 
where you valued options with binomial trees. 


MODULE 11.1: INTEREST RATE TREES AND RISK- 
NEUTRAL PRICING 


LO 11.a: Calculate the expected discounted value of a zero-coupon security using 
a binomial tree. 


LO 11.b: Construct and apply an arbitrage argument to price a call option on a
zero-coupon security using replicating portfolios. 


The binomial interest rate model is used throughout this reading to illustrate the issues 
that must be considered when valuing bonds with embedded options. A binomial 
model is a model that assumes that interest rates can take only one of two possible 
values in the next period. 


This interest rate model makes assumptions about interest rate volatility, along with a 
set of paths that interest rates may follow over time. This set of possible interest rate 
paths is referred to as an interest rate tree. 


Binomial Interest Rate Tree 


The diagram in Figure 11.1 depicts a binomial interest rate tree. 


Figure 11.1: 2-Period Binomial 


To understand this 2-period binomial tree, consider the nodes indicated with the boxes
in Figure 11.1. A node is a point in time when interest rates can take one of two possible
paths: an upper path, U, or a lower path, L. Now consider the node on the right side of
the diagram where the interest rate $i_{2,LU}$ appears. This is the rate that will occur if the
initial rate, $i_0$, follows the lower path from node 0 to node 1 to become $i_{1,L}$, then follows
the upper of the two possible paths to node 2, where it takes on the value $i_{2,LU}$. At the
risk of stating the obvious, the upper path from a given node leads to a higher rate than
the lower path. Notice also that an upward move followed by a downward move gets us
to the same place on the tree as a down-then-up move, so $i_{2,LU} = i_{2,UL}$.


The interest rates at each node in this interest rate tree are 1-period forward rates 
corresponding to the nodal period. Beyond the root of the tree, there is more than one 
1-period forward rate for each nodal period (i.e., at Year 1, we have two 1-year forward 
rates, $i_{1,U}$ and $i_{1,L}$). The relationship among the rates associated with each individual
nodal period is a function of the interest rate volatility assumption of the model being 
employed to generate the tree. 


Constructing the Binomial Interest Rate Tree 


The construction of an interest rate tree, binomial or otherwise, is a tedious process. In 
practice, the interest rate tree is usually generated using specialized computer 
software. There is one underlying rule governing the construction of an interest rate 
tree: The values for on-the-run issues generated using an interest rate tree should 
prohibit arbitrage opportunities. This means that the value of an on-the-run issue 
produced by the interest rate tree must equal its market price. It should be noted that 
in accomplishing this, the interest rate tree must maintain the interest rate volatility 
assumption of the underlying model. 


Valuing an Option-Free Bond With the Tree, Using 
Backward Induction 


Backward induction refers to the process of valuing a bond using a binomial interest 
rate tree. The term “backward” is used because in order to determine the value of a 
bond at node 0, you need to know the values that the bond can take on at node 1. But to 
determine the values of the bond at node 1, you need to know the possible values of the 
bond at node 2, and so on. Thus, for a bond that has N compounding periods, the current 
value of the bond is determined by computing the bond’s possible values at period N 
and working “backward” to node 0. 


Consider the binomial tree shown in Figure 11.2 for a $100 face value, zero-coupon 
bond, with two years remaining until maturity, and a market price of $90.006. Starting 
on the top line, the blocks at each node include the value of the bond and the 1-year 
forward rate at that node. For example, at the upper path of node 1, the price is $93.299, 
and the 1-year forward rate is 7.1826%. 


Figure 11.2: Valuing a 2-Year, Zero-Coupon, Option-Free Bond 


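[Figure 11.2 (image not reproduced): the tree starts at node 0 with a value of $90.006 and a rate of 4.5749%, and ends with a value of $100.00 at each Year 2 node.]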


Know that the value of a bond at a given node in a binomial tree is the average of the 
present values of the two possible values from the next period. The appropriate discount 
rate is the forward rate associated with the node under analysis. 


EXAMPLE: Valuing an option-free bond 


Assuming the bond’s market price is $90.006, demonstrate that the tree in Figure 
11.2 is arbitrage free using backward induction. 


Answer: 
Consider the value of the bond at the upper node for period 1, $V_{1,U}$:

$$V_{1,U} = \frac{(\$100 \times 0.5) + (\$100 \times 0.5)}{1.071826} = \$93.299$$

Similarly, the value of the bond at the lower node for period 1, $V_{1,L}$, is:

$$V_{1,L} = \frac{(\$100 \times 0.5) + (\$100 \times 0.5)}{1.053210} = \$94.948$$

Now calculate $V_0$, the current value of the bond at node 0:

$$V_0 = \frac{(\$93.299 \times 0.5) + (\$94.948 \times 0.5)}{1.045749} = \$90.006$$


Since the computed value of the bond equals the market price, the binomial tree is 
arbitrage free. 
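
The same backward induction can be sketched in a few lines of code. This is a minimal illustration using the rates and 0.5 probabilities from the example; the function name `node_value` is an assumption made for the sketch.

```python
def node_value(value_up, value_down, rate, prob_up=0.5):
    """Expected value of the two next-period outcomes, discounted at this node's rate."""
    expected = prob_up * value_up + (1.0 - prob_up) * value_down
    return expected / (1.0 + rate)


FACE = 100.0          # zero-coupon bond pays face value at every Year 2 node
MARKET_PRICE = 90.006

# Work backward: Year 2 -> Year 1 -> today.
v1_up = node_value(FACE, FACE, 0.071826)
v1_down = node_value(FACE, FACE, 0.053210)
v0 = node_value(v1_up, v1_down, 0.045749)

print(f"V(1,U) = {v1_up:.3f}")
print(f"V(1,L) = {v1_down:.3f}")
print(f"V(0)   = {v0:.3f}")
print("Matches market price:", abs(v0 - MARKET_PRICE) < 0.005)
```

For a coupon-paying bond, the coupon would be added to `value_up` and `value_down` before discounting, as the following note describes.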


PROFESSOR’S NOTE 

When valuing bonds with coupon payments, you need to add the coupons to
the bond prices at each node. For example, with a $100 face value, 7% annual 
coupon bond, you would add the $7 coupon to each price before computing 
present values. Valuing coupon-paying bonds with a binomial tree will be 
illustrated in LO 11.e. 


LO 11.c: Define risk-neutral pricing and apply it to option pricing. 


LO 11.d: Distinguish between true and risk-neutral probabilities and apply this 
difference to interest rate drift. 


Using the 0.5 probabilities for up and down states as shown in the previous example 
may not produce an expected discounted value that exactly matches the market price 
of the bond. This is because the 0.5 probabilities are the assumed true probabilities of 
price movements. In order to equate the discounted value using a binomial tree and the 
market price, we need to use what is known as risk-neutral probabilities. Any 
difference between the risk-neutral and true probabilities is referred to as the interest 
rate drift. 


Using the Risk-Neutral Interest Rate Tree 


There are actually two ways to compute bond and bond derivative values using a 
binomial model. These techniques are referred to as risk-neutral pricing. 


• The first method is to start with spot and forward rates derived from the current
yield curve and then adjust the interest rates on the paths of the tree so that the value 
derived from the model is equal to the current market price of an on-the-run bond 
(i.e., the tree is created to be “arbitrage free”). This is the method we used in the 
previous example. Once the interest rate tree is derived for an on-the-run bond, we 
can use it to price derivative securities on the bond by calculating the expected 
discounted value at each node using the real-world probabilities. 


• The second method is to take the rates on the tree as given and then adjust the
probabilities so that the value of the bond derived from the model is equal to its 
current market price. Once we derive these risk-neutral probabilities, we can use 
them to price derivative securities on the bond by once again calculating the 
expected discounted value at each node using the risk-neutral probabilities and 
working backward through the tree. 


The value of the derivative is the same under either method. 
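
For a single step, the second method amounts to solving price = [p x V(up) + (1 - p) x V(down)] / (1 + r) for p. The sketch below does this with the numbers from the zero-coupon example earlier in this reading; the function name is an assumption. Because that tree's rates were already set so that 0.5 probabilities reproduce the market price, the implied probability comes back at roughly 0.5. With rates taken as given from some other source, the result would generally differ from 0.5.

```python
def risk_neutral_up_probability(market_price, value_up, value_down, rate):
    """Up-move probability that makes the one-step model value equal the market price."""
    forward_value = market_price * (1.0 + rate)
    return (forward_value - value_down) / (value_up - value_down)


# Inputs from the 2-year zero-coupon example: node 1 values and the node 0 rate.
p_up = risk_neutral_up_probability(
    market_price=90.006, value_up=93.299, value_down=94.948, rate=0.045749
)
print(f"Implied risk-neutral up probability: {p_up:.4f}")
```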


MODULE QUIZ 11.1


1. Which of the following statements concerning the calculation of value at a node in a
fixed-income binomial interest rate tree is most accurate? The value at each node is 
the: 

A. present value of the two possible values from the next period. 

B. average of the present values of the two possible values from the next period. 
C. sum of the present values of the two possible values from the next period. 

D. average of the future values of the two possible values from the next period. 


MODULE 11.2: BINOMIAL TREES 


LO 11.e: Explain how the principles of arbitrage pricing of derivatives on fixed- 
income securities can be extended over multiple periods. 


There are three basic steps to valuing an option on a fixed-income instrument using a 
binomial tree: 


Step 1: Price the bond value at each node using the projected interest rates. 
Step 2: Calculate the intrinsic value of the derivative at each node at maturity. 


Step 3: Calculate the expected discounted value of the derivative at each node using the 
risk-neutral probabilities and working backward through the tree. 


Note that the option cannot be properly priced using expected discounted values 
because the call option value depends on the path of interest rates over the life of the 
option. Incorporating the various interest rate paths will prohibit arbitrage from 
occurring. 


EXAMPLE: Call option 


Assume that you want to value a European call option with two years to expiration 
and a strike price of $100.00. The underlying is a 7%, annual coupon bond with 
three years to maturity. Figure 11.3 represents the first two years of the binomial 
tree for valuing the underlying bond. Assume that the risk-neutral probability of an 
up move is 0.76 in Year 1 and 0.60 in Year 2. 


Fill in the missing data in the binomial tree, and calculate the value of the 
European call option. 


PROFESSOR’S NOTE
Since the option is European, it can only be exercised at maturity. 


Figure 11.3: Incomplete Binomial Tree for European Call Option on 3-Year, 

Answer:


Step 1: Calculate the bond prices at each node using the backward induction
methodology. Remember to add the 7% coupon payment to the bond price
at each node when discounting prices.

At the middle node in Year 2, the price is $100.62. You can calculate this by
noting that at the end of Year 2 the bond has one year left to maturity:
N = 1; I/Y = 6.34; PMT = 7; FV = 100; CPT → PV = 100.62

At the bottom node in Year 2, the price is $102.20:
N = 1; I/Y = 4.70; PMT = 7; FV = 100; CPT → PV = 102.20

At the top node in Year 1, the price is $100.37:

$$\frac{(\$105.56 \times 0.6) + (\$107.62 \times 0.4)}{1.0599} = \$100.37$$

At the bottom node in Year 1, the price is $103.65:

$$\frac{(\$107.62 \times 0.6) + (\$109.20 \times 0.4)}{1.0444} = \$103.65$$

Today, the price is $105.01:

$$\frac{(\$107.37 \times 0.76) + (\$110.65 \times 0.24)}{1.03} = \$105.01$$


As shown here, the price at a given node is the expected discounted value of the 
cash flows associated with the two nodes that “feed” into that node. The discount 
rate that is applied is the prevailing interest rate at the given node. Note that since 
this is a European option, you really only need the bond prices at the maturity date 
of the option (end of Year 2) if you are given the arbitrage-free interest rate tree. 
However, it’s good practice to compute all the bond prices. 


Step 2: Determine the intrinsic value of the option at maturity in each node. For
example, the intrinsic value of the option at the bottom node at the end of
Year 2 is $2.20 = $102.20 - $100.00. At the top node in Year 2, the intrinsic
value of the option is zero since the bond price is less than the strike price.


Step 3: Using the backward induction methodology, calculate the option value at 
each node prior to expiration. For example, at the top node for Year 1, the 
option price is $0.23: 


$$\frac{(\$0.00 \times 0.6) + (\$0.62 \times 0.4)}{1.0599} = \$0.23$$


Figure 11.4 shows the binomial tree with all values included. 


Figure 11.4: Completed Binomial Tree for European Call Option on 3-Year, 


The option value today is computed as:

$$\frac{(\$0.23 \times 0.76) + (\$1.20 \times 0.24)}{1.03} = \$0.45$$
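
The three steps of this example can be strung together in a short script. This is a sketch, not a general pricing engine: the rates are read off the discount factors in the worked answer and from Module Quiz 11.2, which restates the same tree, and the dictionary layout and the helper name `discount` are assumptions made for the illustration. Because the script carries unrounded intermediate values, its results can differ from the rounded figures in the text by a cent or so.

```python
COUPON, FACE, STRIKE = 7.0, 100.0, 100.0

# One-year rates at each node of the recombining tree.
r0 = 0.03
r1 = {"U": 0.0599, "L": 0.0444}
r2 = {"UU": 0.0856, "UL": 0.0634, "LL": 0.0470}

# Risk-neutral probabilities of an up move in Year 1 and in Year 2.
p1, p2 = 0.76, 0.60


def discount(value_up, value_down, prob_up, rate):
    """Expected discounted value of the two successor nodes."""
    return (prob_up * value_up + (1.0 - prob_up) * value_down) / (1.0 + rate)


# Step 1: bond prices at the end of Year 2 (one coupon plus face still to come).
bond2 = {node: (FACE + COUPON) / (1.0 + rate) for node, rate in r2.items()}

# Step 2: intrinsic value of the European call at expiration.
call2 = {node: max(price - STRIKE, 0.0) for node, price in bond2.items()}

# Step 3: roll the option value backward under the risk-neutral probabilities.
call1_up = discount(call2["UU"], call2["UL"], p2, r1["U"])
call1_down = discount(call2["UL"], call2["LL"], p2, r1["L"])
call0 = discount(call1_up, call1_down, p1, r0)

print({node: round(price, 2) for node, price in bond2.items()})
print(f"Call value, Year 1 up node:   {call1_up:.2f}")
print(f"Call value, Year 1 down node: {call1_down:.2f}")
print(f"Call value today:             {call0:.2f}")
```

Replacing `max(price - STRIKE, 0.0)` with `max(STRIKE - price, 0.0)` would value the corresponding European put on the same tree.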


Recombining and Nonrecombining 


LO 11.g: Describe the rationale behind the use of recombining trees in option 
pricing. 


In the previous example, the interest rate in the middle node of period two was the
same (i.e., 6.34%) regardless of whether the path was up-then-down or down-then-up. This is
known as a recombining tree. It may be the case, in a practical setting, that the
up-then-down scenario produces a different rate than the down-then-up scenario. An
example of this type of tree may result when any interest rate above a certain level (e.g.,
3%) causes rates to move a fixed number of basis points, but any interest rate below
that level causes rates to move at a pace that is below the up state’s fixed amount.
When rates move in this fashion, the movement process is known as state-dependent
volatility, and it results in nonrecombining trees. From an economic standpoint,
nonrecombining trees are appropriate; however, prices can be very difficult to calculate
when the binomial tree is extended to multiple periods.
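
One way to see why nonrecombining trees become hard to work with is to count the nodes at each step. In a recombining binomial tree, step n has n + 1 distinct nodes; if no paths recombine, it has 2^n. The snippet below simply tabulates the two counts; the function names are illustrative.

```python
def recombining_nodes(step):
    """Distinct nodes at a given step when up-down and down-up paths coincide."""
    return step + 1


def nonrecombining_nodes(step):
    """Distinct nodes at a given step when every path ends at its own node."""
    return 2 ** step


for step in (2, 10, 20, 30):
    print(step, recombining_nodes(step), nonrecombining_nodes(step))
```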


Constant-Maturity Treasury Swap 


LO 11.h: Calculate the value of a constant-maturity Treasury swap, given an 
interest rate tree and the risk-neutral probabilities. 


In addition to valuing options with binomial interest rate trees, we can also value other 
derivatives such as swaps. The following example calculates the price of a constant- 
maturity Treasury (CMT) swap. A CMT swap is an agreement to swap a floating rate 
for a Treasury rate such as the 10-year rate. 


EXAMPLE: CMT swap 
Assume that you want to value a constant-maturity Treasury (CMT) swap. The
swap pays the following every six months until maturity:

$$\frac{\$1,000,000}{2} \times (y_{CMT} - 7\%)$$

$y_{CMT}$ is a semiannually compounded yield, of a predetermined maturity, at the time
of payment ($y_{CMT}$ is equivalent to 6-month spot rates). Assume there is a 76%
risk-neutral probability of an increase in the 6-month spot rate and a 60% risk-neutral
probability of an increase in the 1-year spot rate.


Fill in the missing data in the binomial tree, and calculate the value of the swap. 


Figure 11.5: Incomplete Binomial Tree for CMT Swap 

Answer:


In six months, the top node and bottom node payoffs are, respectively:

$$\text{payoff}_{1,U} = \frac{\$1,000,000}{2} \times (7.25\% - 7.00\%) = \$1,250$$

$$\text{payoff}_{1,L} = \frac{\$1,000,000}{2} \times (6.75\% - 7.00\%) = -\$1,250$$

Similarly in one year, the top, middle, and bottom payoffs are, respectively:

$$\text{payoff}_{2,U} = \frac{\$1,000,000}{2} \times (7.50\% - 7.00\%) = \$2,500$$

$$\text{payoff}_{2,M} = \frac{\$1,000,000}{2} \times (7.00\% - 7.00\%) = \$0$$

$$\text{payoff}_{2,L} = \frac{\$1,000,000}{2} \times (6.50\% - 7.00\%) = -\$2,500$$


The possible prices in six months are given by the expected discounted value of the 
1-year payoffs under the risk-neutral probabilities, plus the 6-month payoffs 
($1,250 and -$1,250). Hence, the 6-month values for the top and bottom node are, 
respectively: 


$$V_{1,U} = \frac{(\$2,500 \times 0.6) + (\$0 \times 0.4)}{1 + \frac{0.0725}{2}} + \$1,250 = \$2,697.53$$

$$V_{1,L} = \frac{(\$0 \times 0.6) + (-\$2,500 \times 0.4)}{1 + \frac{0.0675}{2}} - \$1,250 = -\$2,217.35$$

Today the price is $1,466.63, calculated as follows:

$$V_0 = \frac{(\$2,697.53 \times 0.76) + (-\$2,217.35 \times 0.24)}{1 + \frac{0.07}{2}} = \$1,466.63$$


Figure 11.6 shows the binomial tree with all values included. 


Figure 11.6: Completed Binomial Tree for CMT Swap 
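
The CMT swap valuation above can be reproduced with a short script. It is a sketch using the rates, payoffs, and risk-neutral probabilities quoted in the answer; the variable and function names are assumptions. Since it does not round intermediate values, the result may differ from the text's $1,466.63 by about a cent.

```python
NOTIONAL = 1_000_000
FIXED_RATE = 0.07


def payoff(cmt_rate):
    """Semiannual swap payment: half the notional times (CMT rate - fixed rate)."""
    return NOTIONAL / 2 * (cmt_rate - FIXED_RATE)


def discount(value_up, value_down, prob_up, rate):
    """Expected value discounted over one six-month step at a semiannual rate."""
    return (prob_up * value_up + (1.0 - prob_up) * value_down) / (1.0 + rate / 2)


# Tree rates: today, in six months (up/down), and in one year (up/middle/down).
r0 = 0.07
r1_up, r1_down = 0.0725, 0.0675
r2_up, r2_mid, r2_down = 0.0750, 0.0700, 0.0650

# Risk-neutral up probabilities for the first and second steps.
p1, p2 = 0.76, 0.60

# Six-month values: discounted one-year payoffs plus the six-month payoff.
v1_up = discount(payoff(r2_up), payoff(r2_mid), p2, r1_up) + payoff(r1_up)
v1_down = discount(payoff(r2_mid), payoff(r2_down), p2, r1_down) + payoff(r1_down)

# Value today.
v0 = discount(v1_up, v1_down, p1, r0)

print(f"V(1,U) = {v1_up:,.2f}")
print(f"V(1,L) = {v1_down:,.2f}")
print(f"V(0)   = {v0:,.2f}")
```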

MODULE QUIZ 11.2


1. A European put option has two years to expiration and a strike price of $101.00. The
underlying is a 7% annual coupon bond with three years to maturity. Assume that the
risk-neutral probability of an up move is 0.76 in Year 1 and 0.60 in Year 2. The
current interest rate is 3.00%. At the end of Year 1, the rate will either be 5.99% or
4.44%. If the rate in Year 1 is 5.99%, it will either rise to 8.56% or rise to 6.34% in
Year 2. If the rate in Year 1 is 4.44%, it will either rise to 6.34% or rise to 4.70%.
The value of the put option today is closest to:

A. $1.17. 
B. $1.30. 
C. $1.49. 
D. $1.98. 


MODULE 11.3: OPTION-ADJUSTED SPREAD 
LO 11.f: Define option-adjusted spread (OAS) and apply it to security pricing. 


The option-adjusted spread (OAS) is the spread that makes the model value (calculated 
by the present value of projected cash flows) equal to the current market price. In the 
previous CMT example, the model price was equal to $1,466.63. Now assume that the 
market price of the CMT swap was instead $1,464.40, which is $2.23 less than the 
model price. In this case, the OAS to be added to each discounted risk-neutral rate in 
the CMT swap binomial tree turns out to be 20 basis points. In six months, the rates to 
be adjusted are 7.25% in the up node and 6.75% in the down node. Incorporating the 
OAS into the six-month rates generates the following new swap values: 

$$V_{1,U} = \frac{(\$2,500 \times 0.6) + (\$0 \times 0.4)}{1 + \frac{0.0745}{2}} + \$1,250 = \$2,696.13$$

$$V_{1,L} = \frac{(\$0 \times 0.6) + (-\$2,500 \times 0.4)}{1 + \frac{0.0695}{2}} - \$1,250 = -\$2,216.42$$


Notice that the only rates adjusted by the OAS are the rates used for discounting
values. The OAS does not impact the rates used for estimating cash flows. The final step
in this CMT swap valuation is to adjust the interest rate used to discount the price back
to today. In this example, the discount rate of 7% is adjusted by 20 basis points to
7.2%. The updated initial CMT swap value is:


$$V_0 = \frac{(\$2,696.13 \times 0.76) + (-\$2,216.42 \times 0.24)}{1 + \frac{0.072}{2}} = \$1,464.40$$

Now we can see that adding the OAS to the discounted risk-neutral rates in the
binomial tree generates a model price ($1,464.40) that is equal to the market price
($1,464.40). In this example, the market price was initially less than the model price.
This means that the security was trading cheap. If the market price were instead higher
than the model price, we would say that the security was trading rich.
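
The text states the 20-basis-point OAS without showing how it is found. One common way to back out such a spread is a simple numerical search, sketched below with bisection. This is an illustration under stated assumptions rather than the method used in the reading: the search bounds and tolerance are arbitrary choices, the function names are invented, and the spread is added only to the discounting rates, never to the rates that determine the cash flows.

```python
NOTIONAL = 1_000_000
FIXED_RATE = 0.07
P1, P2 = 0.76, 0.60  # risk-neutral up probabilities for steps 1 and 2


def payoff(cmt_rate):
    """Cash flow uses the unadjusted tree rate."""
    return NOTIONAL / 2 * (cmt_rate - FIXED_RATE)


def swap_value(oas):
    """Model value of the CMT swap when `oas` is added to every discounting rate."""
    def discount(value_up, value_down, prob_up, rate):
        expected = prob_up * value_up + (1.0 - prob_up) * value_down
        return expected / (1.0 + (rate + oas) / 2)

    v1_up = discount(payoff(0.075), payoff(0.07), P2, 0.0725) + payoff(0.0725)
    v1_down = discount(payoff(0.07), payoff(0.065), P2, 0.0675) + payoff(0.0675)
    return discount(v1_up, v1_down, P1, 0.07)


def solve_oas(market_price, low=-0.01, high=0.01, tol=1e-10):
    """Bisection search; relies on the model value falling as the spread rises."""
    while high - low > tol:
        mid = (low + high) / 2
        if swap_value(mid) > market_price:
            low = mid   # model value still too high: a larger spread is needed
        else:
            high = mid
    return (low + high) / 2


oas = solve_oas(1464.40)
print(f"OAS: {oas * 10_000:.1f} basis points")
print(f"Model value at that OAS: {swap_value(oas):,.2f}")
```

A positive spread here lowers the model value toward the lower market price, which is consistent with the security being described as cheap.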


Time Steps 


LO 11.i: Evaluate the advantages and disadvantages of reducing the size of the 
time steps on the pricing of derivatives on fixed-income securities. 


For the sake of simplicity, the previous example assumed periods of six months. 
However, in reality, the time between steps should be much smaller. As you can 
imagine, the smaller the time between steps, the more complicated the tree and 
calculations become. Using daily time steps will greatly enhance the accuracy of any 
model but at the expense of additional computational complexity. 


Fixed-Income Securities and Black-Scholes-Merton 


LO 11.j: Evaluate the appropriateness of the Black-Scholes-Merton model when 
valuing derivatives on fixed-income securities. 


The Black-Scholes-Merton model is the most well-known equity option pricing model. 
Unfortunately, the model is based on three assumptions that do not apply to fixed- 
income securities: 

1. The model’s main shortcoming is that it assumes there is no upper limit to the price 
of the underlying asset. However, bond prices do have a maximum value. This upper 
limit occurs when interest rates equal zero so that zero-coupon bonds are priced at 
par and coupon bonds are priced at the sum of the coupon payments plus par. 


2. It assumes the risk-free rate is constant. However, changes in short-term rates do 
occur, and these changes cause rates along the yield curve and bond prices to change. 


3. It assumes bond price volatility is constant. With bonds, however, price volatility 
decreases as the bond approaches maturity. 


Bonds With Embedded Options 


Fixed-income securities are often issued with embedded options, such as a call 
feature. In this case, the price-yield relationship will change, and so will the price 
volatility characteristics of the issue. 


Callable Bonds 


A call option gives the issuer the right to buy back the bond at fixed prices at one or 
more points in the future, prior to the date of maturity. Since the investor takes a short 
position in the call, the right to purchase rests with the issuer. Such bonds are deemed 
to be callable (note that a call provision on a bond is analytically similar to a 
prepayment option). 


Figure 11.7: Price-Yield Function of Callable Bond 

For an option-free noncallable bond, prices will fall as yields rise, and prices will rise
unabated as yields fall; in other words, they’ll move as expected with yields. That’s not
the case, however, with callable bonds. As you can see in Figure 11.7, as the yield on a
callable bond declines, the bond reaches a point where the rate of increase in its price
starts slowing down and eventually levels off.


This is known as negative convexity. Such behavior is due to the fact that the issuer 
has the right to retire the bond prior to maturity at some specified call price. The call 
price, in effect, acts to hold down the price of the bond (as rates fall) and causes the 
price-yield curve to flatten. The point where the curve starts to flatten is at (or near) a 
yield level of y’. Note that as long as yields remain above y’, a callable bond will behave 
like any option-free (noncallable) issue and exhibit positive convexity. That’s because at 
high yield levels, there is little chance of the bond being called. 


Below y’, investors begin to anticipate that the firm may call the bond, in which case 
investors will receive the call price. Therefore, as yield levels drop, the bond’s market 
value is bounded from above by the call price. Thus, callability effectively caps the 
investor’s capital gains as yields fall. Moreover, it exacerbates reinvestment risk since it 
increases the cash flow that must be reinvested at lower rates (i.e., without the call or 
prepayment option, the cash flow will only be the coupon; with the option, the cash 
flow is the coupon plus the call price). 


Thus, in Figure 11.7, as long as yields remain below y’, callable bonds will exhibit price 
compression, or negative convexity; however, at yields above y’, those same callable 
bonds will exhibit all the properties of positive convexity. 


Putable Bonds 


The put feature in putable bonds is another type of embedded option. The put feature 
gives the bondholder the right to sell the bond back to the issuer at a set price (i.e., the 
bondholder can “put” the bond to the issuer). The impact of the put feature on the 
price-yield relationship is shown in Figure 11.8. 


Figure 11.8: Price-Yield Function of a Putable Bond 
At low yield levels relative to the coupon rate, the price-yield relationship of putable
and nonputable bonds is similar. However, as shown in Figure 11.8, if yields rise above 
y’, the price of the putable bond does not fall as rapidly as the price of the option-free 
bond. This is because the put price serves as a floor value for the price of the bond. 


MODULE QUIZ 11.3


1. The Black-Scholes-Merton option pricing model is not appropriate for valuing options
on corporate bonds because corporate bonds:
A. have credit risk.
B. have an upper price bound.
C. have constant price volatility.
D. are not priced by arbitrage.


2. Which of the following regarding the use of small time steps in the binomial model is 
true? 


A. Less realistic model. 

B. More accurate model. 

C. Less complicated computations. 
D. Less computational expense. 


3. Which of the following statements about callable bonds compared to noncallable bonds 
is false? 


A. They have less price volatility. 

B. They have negative convexity. 

C. Capital gains are capped as yields rise. 

D. At low yields, reinvestment rate risk rises. 


KEY CONCEPTS 


LO 11.a


Backward induction methodology with a binomial model requires discounting of the 
cash flows that occur at each node in an interest rate tree (bond value plus coupon 
payment) backward to the root of the tree. 


LO 11.b 


The values for on-the-run issues generated using an interest rate tree should prohibit 
arbitrage opportunities. 


LO 11.c 


Risk-neutral, or no-arbitrage, binomial tree models are used to allow for proper 
valuation of bonds with embedded options. 


LO 11.d 


Using the average of present values of the two possible values from the next period may 
not produce an expected discounted bond value that exactly matches the market price 
of the bond. 


The 0.5 probabilities for the up and down states are the assumed true probabilities of
price movements. In order to equate the discounted value using a binomial tree and the
market price, we need to use risk-neutral probabilities.


LO 11.e 

To value an option on a fixed-income instrument using a binomial tree: 

1. Price the bond at each node using projected interest rates. 

2. Calculate the intrinsic value of the derivative at each node at maturity. 


3. Calculate the expected discounted value of the derivative at each node using the risk- 
neutral probabilities and working backward through the tree. 


Callable bonds can be valued by modifying the cash flows at each node in the interest 
rate tree to reflect the cash flow prescribed by the embedded call option. 


LO 11.f 


The option-adjusted spread (OAS) allows a security’s model price to equal its market 
price. It is added to any rate in the interest rate tree that is used for discounting 
purposes. 


LO 11.g 


Nonrecombining trees result when the up-down scenario produces a different rate than 
the down-up scenario. 


LO 11.h 


A constant-maturity Treasury (CMT) swap is an agreement to swap a floating rate for a 
Treasury rate. 


LO 11.i 


The precision of a model can be improved by reducing the length of the time steps, but 
the tradeoff is increased complexity. 


LO 11.j


The Black-Scholes-Merton model cannot be used for the valuation of fixed-income
securities because it makes the following unreasonable assumptions:

• There is no upper price bound.
• The risk-free rate is constant.
• Bond volatility is constant.


ANSWER KEY FOR MODULE QUIZZES 
Module Quiz 11.1 


1.B The value at any given node in a binomial tree is the average present value of the 
cash flows at the two possible states immediately to the right of the given node, 
discounted at the 1-period rate at the node under examination. (LO 11.a) 


Module Quiz 11.2 


1.A This is the same underlying bond and interest rate tree as in the call option 
example from this reading. However, here we are valuing a put option. 


The option value in the upper node at the end of Year 1 is computed as:

\[ \frac{(\$2.44 \times 0.6) + (\$0.38 \times 0.4)}{1.0599} = \$1.52 \]

The option value in the lower node at the end of Year 1 is computed as:

\[ \frac{(\$0.38 \times 0.6) + (\$0.00 \times 0.4)}{1.0444} = \$0.22 \]

The option value today is computed as:

\[ \frac{(\$1.52 \times 0.76) + (\$0.22 \times 0.24)}{1.0300} = \$1.17 \]


(LO 11.e) 


Module Quiz 11.3 


1.B The Black-Scholes-Merton model cannot be used for the valuation of fixed-income
securities because it makes the following assumptions, which are not reasonable
for valuing fixed-income securities:

• There is no upper price bound.
• The risk-free rate is constant.
• Bond volatility is constant.

(LO 11.j)


2.B The use of small time steps in the binomial model yields a more realistic model, a 
more accurate model, more complicated computations, and more computational 
expense. (LO 11.i) 

3.C Callable bonds have the following characteristics:

• Less price volatility.
• Negative convexity.
• Capital gains are capped as yields fall.
• Exhibit increased reinvestment rate risk when yields fall.

(LO 11.j)


The following is a review of the Market Risk Measurement and Management principles designed to address the
learning objectives set forth by GARP®.


READING 12


THE EVOLUTION OF SHORT RATES 
AND THE SHAPE OF THE TERM 
STRUCTURE 


Study Session 3 


EXAM FOCUS 


This reading discusses how the decision tree framework is used to estimate the price 
and returns of zero-coupon bonds. This decision tree framework illustrates how 
interest rate expectations determine the shape of the yield curve. For the exam, 
candidates should understand how current spot rates and forward rates are determined 
by the expectations, volatility, and risk premiums of short-term rates. Furthermore, the 
use of Jensen’s inequality should be understood, and candidates should be prepared to 
use this formula to demonstrate how maturity and volatility increase the convexity of 
zero-coupon bonds. 


MODULE 12.1: INTEREST RATES 


Interest Rate Expectations 


LO 12.a: Explain the role of interest rate expectations in determining the shape 
of the term structure. 


Expectations of future interest rates are subject to uncertainty. For example, an investor
may expect that interest rates over the next period will be 8%. However, this investor 
may realize there is also a high probability that interest rates could be 7% or 9% over 
the next period. 


Expectations play an important role in determining the shape of the yield curve and 
can be illustrated by examining yield curves that are flat, upward-sloping, and 
downward-sloping. In the following yield curve examples, assume future interest rates
are known and that there is no uncertainty in rates.


Flat Yield Curve 


Suppose the 1-year interest rate is 8% and future 1-year forward rates are 8% for the 
next two years. Given these interest rate expectations, the present values of 1-, 2-, and 
3-year zero-coupon bonds with $1 face values assuming annual compounding are 
calculated as follows: 


\[ \text{price of 1-year zero-coupon bond} = \frac{\$1}{1.08} = \$0.92593 \]

\[ \text{price of 2-year zero-coupon bond} = \frac{\$1}{(1.08)(1.08)} = \$0.85734 \]

\[ \text{price of 3-year zero-coupon bond} = \frac{\$1}{(1.08)(1.08)(1.08)} = \$0.79383 \]


In this example, investors expect the 1-year spot rates for the next three years to be 8%. 
Thus, the yield curve is flat and investors are willing to lock in interest rates for two or 
three years at 8%. 


Upward-Sloping Yield Curve 


Now suppose the 1-year interest rate will remain at 8%, but investors expect the 1-year 
rate in one year to be 10% and the 1-year rate in two years to be 12%. The 2-year spot
rate, \(\hat{r}(2)\), must satisfy the following equation:


\[ \text{price of 2-year zero-coupon bond} = \frac{\$1}{(1.08)(1.10)} = \frac{\$1}{(1 + \hat{r}(2))^2} \]


Cross multiplying, taking the square root of each side, and subtracting one from each
side results in the following equation:


\[ \hat{r}(2) = \sqrt{(1.08)(1.10)} - 1 = \sqrt{1.188} - 1 \]

Solving this equation reveals that the 2-year spot rate is 8.995%.


The 3-year spot rate can be solved in a similar fashion:


\[ \text{price of 3-year zero-coupon bond} = \frac{\$1}{(1.08)(1.10)(1.12)} = \frac{\$1}{(1 + \hat{r}(3))^3} \]


Thus, the 3-year spot rate is computed as follows:


\[ \hat{r}(3) = \sqrt[3]{(1.08)(1.10)(1.12)} - 1 = \sqrt[3]{1.33056} - 1 = 9.988\% \]


If expected 1-year spot rates for the next three years are 8%, 10%, and 12%, then this 
results in an upward-sloping yield curve of 1-, 2-, and 3-year spot rates of 8%, 8.995%, 
and 9.988%, respectively. Thus, the expected future spot rates will determine the 
upward-sloping shape of the yield curve. 


Downward-Sloping Yield Curve 


Now suppose the 1-year interest rate will remain at 8%, but investors expect the 1-year 
rate in one year to be 6% and the 1-year rate in two years to be 4%. The 2-year and 3- 
year spot rates are computed as follows: 


\[ \hat{r}(2) = \sqrt{(1.08)(1.06)} - 1 = 6.995\% \]

\[ \hat{r}(3) = \sqrt[3]{(1.08)(1.06)(1.04)} - 1 = 5.987\% \]


These calculations imply a downward-sloping yield curve for 1-, 2-, and 3-year spot 
rates of 8%, 6.995%, and 5.987%, respectively. 


These three examples illustrate that expectations of future interest rates can describe 
the shape and level of the term structure for short-term horizons. If expected 1-year 
spot rates for the next three years are \(r_1\), \(r_2\), and \(r_3\), then the 2-year and 3-year spot
rates are computed as:

\[ \hat{r}(2) = \sqrt{(1 + r_1)(1 + r_2)} - 1 \]

\[ \hat{r}(3) = \sqrt[3]{(1 + r_1)(1 + r_2)(1 + r_3)} - 1 \]

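As a rough illustration of the spot rate formulas above, the following Python sketch compounds a path of expected 1-year rates and takes the geometric average. The helper name `spot_rate` and the variable names are illustrative assumptions rather than part of the reading, and the sketch assumes annual compounding with no uncertainty in rates, as in the three yield curve examples.

```python
def spot_rate(expected_rates):
    """Spot rate implied by a known path of 1-year rates (annual compounding).

    `spot_rate` is a hypothetical helper name used only for this sketch.
    """
    growth = 1.0
    for r in expected_rates:
        growth *= 1.0 + r              # compound one year at a time
    n = len(expected_rates)
    return growth ** (1.0 / n) - 1.0   # geometric average growth minus one


# Expected 1-year rates from the upward- and downward-sloping examples.
paths = {
    "upward": [0.08, 0.10, 0.12],
    "downward": [0.08, 0.06, 0.04],
}

for label, path in paths.items():
    curve = [spot_rate(path[:n]) for n in (1, 2, 3)]
    print(label, ", ".join(f"{s:.3%}" for s in curve))
```

With these inputs, the sketch should reproduce the 2-year and 3-year spot rates from the examples (8.995% and 9.988% for the upward-sloping case, 6.995% and 5.987% for the downward-sloping case).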
In the short run, expected future spot rates determine the shape of the yield curve. In
the long run, however, it is less likely that investors have confidence in 1-year spot
rates several years from now (e.g., 30 years). Thus, expectations are unable to describe
the shape of the term structure for long-term horizons. However, it is reasonable to
assume that real rates and inflation rates are relatively constant over the long run. For
example, a short-term rate of 5% may imply a long-run real rate of interest of 3% and a
long-run inflation rate of 2%. Thus, interest rate expectations can describe the level of
interest rates for long-term horizons.


Interest Rate Volatility 


LO 12.b: Apply a risk-neutral interest rate tree to assess the effect of volatility on 
the shape of the term structure. 


Suppose an investor is risk-neutral and has uncertainty regarding expectations of future 
interest rates. The decision tree in Figure 12.1 represents expectations that are used to 
determine the 1-year rate. If there is a 50% probability that the 1-year rate in one year 
will be 10% and a 50% probability that the 1-year rate in one year will be 6%, the
expected 1-year rate in one year can be calculated as: 


(0.5 x 10%) + (0.5 x 6%) = 8% 


Using the expected rates in Figure 12.1, the price of a 1-year zero-coupon bond is
$0.92593, or 92.593% of par (calculated as $1 / 1.08 with a $1 par value).


The last column of the decision tree in Figure 12.1 illustrates the joint probabilities of 
12%, 8%, or 4% 1-year rates in two years. The 12% rate in the upper node occurs 25% 
of the time (50% x 50%), the 8% rate in the middle node occurs 50% of the time (50% 
x 50% + 50% x 50%), and the 4% rate in the bottom node occurs 25% of the time (50% 
x 50%). Thus, the expected 1-year rate in two years is calculated as: 


(0.25 x 12%) + (0.50 x 8%) + (0.25 x 4%) = 8% 


Figure 12.1: Decision Tree Illustrating Expected 1-Year Rates for Two Years 
Assuming risk neutrality, the decision tree in Figure 12.2 illustrates the calculation of
the price of a 2-year zero-coupon bond with a face value of $1 using the expected rates 
given in the Figure 12.1 decision tree. The expected price in one year for the upper node 
is $0.90909, calculated as $1 / 1.10. The expected price in one year for the lower node is 
$0.94340, calculated as $1 / 1.06. Thus, the current price of the 2-year zero-coupon 
bond is calculated as: 


[0.5 x ($0.90909 / 1.08)] + [0.5 x ($0.94340 / 1.08)] = $0.85763 


Figure 12.2: Risk-Neutral Decision Tree for a 2-Year Zero-Coupon Bond 
With the present value of the 2-year zero-coupon bond, we can compute the implied
2-year spot rate by solving for \(\hat{r}(2)\) as follows:

\[ \$0.85763 = \frac{\$1}{(1 + \hat{r}(2))^2} \]

\[ \hat{r}(2) = \sqrt{\frac{1}{0.85763}} - 1 = 0.079816 \text{ or } 7.9816\% \]

Alternatively, this can also be computed using a financial calculator as follows:

PV = -0.85763; FV = 1; PMT = 0; N = 2; CPT → I/Y = 7.9816%

This example illustrates that when there is uncertainty regarding expected rates, the
volatility of expected rates causes the future spot rates to be lower. With the implied
rate, we can compute the value of convexity for the 2-year zero-coupon bond as: 8% -
7.9816% = 0.0184% or 1.84 basis points.
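The 2-year calculation can be sketched in Python as follows. This is a minimal illustration that assumes the rates from Figure 12.1 (a current rate of 8%, then 10% or 6% with equal probability); the function and variable names are hypothetical and chosen only for this sketch.

```python
def two_year_zero_price(r0, r_up, r_down, p_up=0.5):
    """Risk-neutral price today of a $1 zero-coupon bond maturing in two years."""
    # Expected price in one year, averaging over the up and down states.
    expected_price_in_one_year = p_up / (1.0 + r_up) + (1.0 - p_up) / (1.0 + r_down)
    # Discount one more year at the current 1-year rate.
    return expected_price_in_one_year / (1.0 + r0)


price = two_year_zero_price(r0=0.08, r_up=0.10, r_down=0.06)
implied_spot = price ** -0.5 - 1.0        # solves price = 1 / (1 + r)^2
expected_rate = 0.5 * 0.10 + 0.5 * 0.06   # 8%, the expected 1-year rate
convexity_bp = (expected_rate - implied_spot) * 10_000

print(f"price = {price:.5f}")
print(f"implied 2-year spot rate = {implied_spot:.4%}")
print(f"value of convexity = {convexity_bp:.2f} bp")
```

Because the sketch carries unrounded intermediate values, its convexity value comes out close to, but not exactly at, the 1.84 basis points shown in the text, which solves for the rate from the rounded price of $0.85763.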


MODULE QUIZ 12.1

1. An investor expects the current 1-year rate for a zero-coupon bond to remain at 6%,
the 1-year rate next year to be 8%, and the 1-year rate in two years to be 10%. What is 
the 3-year spot rate for a zero-coupon bond with a face value of $1, assuming all 
investors have the same expectations of future 1-year rates for zero-coupon bonds? 
A. 7.888%. 
B. 7.988%. 
C. 8.000%. 
D. 8.088%. 


2. Suppose investors have interest rate expectations as illustrated in the decision tree 
below where the 1-year rate is expected to be 8%, 6%, or 4% in the second year and 
either 7% or 5% in the first year for a zero-coupon bond. 


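[Figure: decision tree of 1-year rates with a 50% probability on each branch. Current rate: 6%; rates in one year: 7% or 5%; rates in two years: 8%, 6%, or 4%.]

If investors are risk-neutral, what is the price of a $1 face value 2-year zero-coupon
bond today?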

A. $0.88113. 

B. $0.88634. 

C. $0.89007. 

D. $0.89032. 


3. If investors are risk-neutral and the price of a 2-year zero-coupon bond is $0.88035 
today, what is the implied 2-year spot rate? 
A. 4.839%. 
B. 5.230%. 
C. 5.827%. 
D. 6.579%. 


MODULE 12.2: CONVEXITY AND RISK PREMIUM 


Convexity Effect 


LO 12.c: Estimate the convexity effect using Jensen’s inequality. 


LO 12.d: Evaluate the impact of changes in maturity, yield, and volatility on the 
convexity of a security. 


The convexity effect can be measured by applying a special case of Jensen’s inequality 
as follows: 


\[ E\left[\frac{1}{1 + r}\right] > \frac{1}{E[1 + r]} \]


EXAMPLE: Applying Jensen’s inequality 


Assume that next year there is a 50% probability that 1-year spot rates will be 
10% and a 50% probability that 1-year spot rates will be 6%. Demonstrate 
Jensen’s inequality for a 2-year zero-coupon bond with a face value of $1 assuming 
the previous interest rate expectations shown in Figure 12.1. 


Answer: 
The left-hand side of Jensen’s inequality is the expected price in one year using the 
1-year spot rates of 10% and 6%. 


\[ E\left[\frac{\$1}{1 + r}\right] = 0.5 \times \frac{\$1}{1.10} + 0.5 \times \frac{\$1}{1.06} = \$0.92624 \]


The expected price in one year using an expected rate of 8% computes the right- 
hand side of the inequality as: 


\[ \frac{\$1}{E[1 + r]} = \frac{\$1}{0.5 \times 1.10 + 0.5 \times 1.06} = \frac{\$1}{1.08} = \$0.92593 \]


Thus, the left-hand side is greater than the right-hand side, $0.92624 > $0.92593. 


If the current 1-year rate is 8%, then the price of a 2-year zero-coupon bond is 
found by simply dividing each side of the equation by 1.08. In other words, 
discount the expected 1-year zero-coupon bond price for one more year at 8% to 
find the 2-year price. The price of the 2-year zero-coupon bond on the left-hand 
side of Jensen’s inequality equals $0.85763 (calculated as $0.92624 / 1.08). The 
right-hand side is calculated as the price of a 2-year zero-coupon bond discounted 
for two years at the expected rate of 8%, which equals $0.85734 (calculated as $1 /
1.08^2).


The left-hand side is again greater than the right-hand side, $0.85763 > $0.85734. 


This demonstrates that the price of the 2-year zero-coupon bond is greater than 
the price obtained by discounting the $1 face amount by 8% over the first period 
and by 8% over the second period. Therefore, we know that since the 2-year zero- 
coupon price is higher than the price achieved through discounting, its implied rate 
must be lower than 8%. 
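Both sides of Jensen’s inequality in this example can be checked numerically. The Python sketch below is only an illustration; `jensen_sides` is a hypothetical helper name, and the inputs are the 10% and 6% rates with 50% probabilities used in the example.

```python
def jensen_sides(rates, probs):
    """Return (E[1/(1+r)], 1/E[1+r]) for a discrete distribution of rates."""
    lhs = sum(p / (1.0 + r) for r, p in zip(rates, probs))         # E[1/(1+r)]
    rhs = 1.0 / sum(p * (1.0 + r) for r, p in zip(rates, probs))   # 1/E[1+r]
    return lhs, rhs


lhs, rhs = jensen_sides(rates=[0.10, 0.06], probs=[0.5, 0.5])
current_rate = 0.08

print(f"expected price in one year: {lhs:.5f} > {rhs:.5f}")
# Discounting both sides one more year gives the two 2-year prices.
print(f"2-year price today: {lhs / (1 + current_rate):.5f} "
      f"> {rhs / (1 + current_rate):.5f}")
```

The left-hand side exceeds the right-hand side whenever the rate distribution has any dispersion, which is why the gap widens as the spread between the up and down rates grows.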


Extending this example out for one more year illustrates that convexity increases with
maturity. Suppose an investor expects the spot rates to be 14%, 10%, 6%, or 2% in
three years. Assuming up and down moves are equally likely at each node
results in the decision tree shown in Figure 12.3.


Figure 12.3: Risk-Neutral Decision Tree Illustrating Expected 1-Year Rates for Three Years

The decision tree in Figure 12.4 uses the expected spot rates from the decision tree in
Figure 12.3 to calculate the price of a 3-year zero-coupon bond. 


The price of a 1-year zero-coupon bond in two years with a face value of $1 for the 
upper node is $0.89286 (calculated as $1 / 1.12). The price of a 1-year zero-coupon 
bond in two years for the middle node is $0.92593 (calculated as $1 / 1.08). The price 
of a 1-year zero-coupon bond in two years for the bottom node is $0.96154 (calculated 
as $1 / 1.04). 


The price of a 2-year zero-coupon bond in one year using the upper node expected spot
rates is calculated as:


[0.5 x ($0.89286 / 1.10)] + [0.5 x ($0.92593 / 1.10)] = $0.82672


The price of a 2-year zero-coupon bond in one year using the bottom node expected
spot rates is calculated as:


[0.5 x ($0.92593 / 1.06)] + [0.5 x ($0.96154 / 1.06)] = $0.89032


Lastly, the price of a 3-year zero-coupon bond today is calculated as:


[0.5 x ($0.82672 / 1.08)] + [0.5 x ($0.89032 / 1.08)] = $0.79493


Figure 12.4: Risk-Neutral Decision Tree for a 3-Year Zero-Coupon Bond 
To measure the convexity effect, the implied 3-year spot rate is calculated by solving
for \(\hat{r}(3)\) in the following equation:

\[ \$0.79493 = \frac{\$1}{(1 + \hat{r}(3))^3} \]

\[ \hat{r}(3) = \sqrt[3]{\frac{1}{0.79493}} - 1 = 0.0795 \text{ or } 7.95\% \]


Notice that convexity lowers bond yields and that this reduction in yields is equal to 
the value of convexity. For the 3-year zero-coupon bond, the value of convexity is 8% - 
7.95% = 0.05% or 5 basis points. Recall that the value of convexity for the 2-year zero- 
coupon bond was only 1.84 basis points. Therefore, all else held equal, the value of 
convexity increases with maturity. In other words, as the maturity of a bond increases, 
the price-yield relationship becomes more convex. 
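The backward induction used for the 2-year and 3-year bonds generalizes to any recombining tree. The Python sketch below is one possible way to write it, under the assumption of 50% risk-neutral probabilities and the rates described for Figures 12.1 and 12.3 through two years; the names `zero_price`, `implied_spot`, and `tree` are hypothetical.

```python
def zero_price(tree, p_up=0.5):
    """Price of a $1 zero-coupon bond by backward induction.

    tree[t] lists the 1-year rates at date t, ordered from the top node
    to the bottom node of a recombining tree. The bond matures one year
    after the last date in the tree.
    """
    values = [1.0] * (len(tree[-1]) + 1)   # face value in every final state
    for level in reversed(tree):
        values = [
            (p_up * values[i] + (1.0 - p_up) * values[i + 1]) / (1.0 + r)
            for i, r in enumerate(level)
        ]
    return values[0]


def implied_spot(price, years):
    """Annually compounded spot rate implied by a zero-coupon price."""
    return price ** (-1.0 / years) - 1.0


tree = [[0.08], [0.10, 0.06], [0.12, 0.08, 0.04]]

for years in (2, 3):
    price = zero_price(tree[:years])
    spot = implied_spot(price, years)
    print(f"{years}-year: price = {price:.5f}, spot = {spot:.3%}, "
          f"convexity = {(0.08 - spot) * 10_000:.2f} bp")
```

Small differences from the figures in the text (for example in the last decimal of the 3-year price) arise because the text rounds intermediate node values, while the sketch does not.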


This convexity occurs due to volatility. Thus, we can also say that the value of
convexity increases with volatility. The decision trees in Figure 12.5 and
Figure 12.6 illustrate the impact of increasing the volatility of interest rates. In this
example, the 1-year spot rate in two years ranges from 2% to 14% in Figure 12.5, instead
of 4% to 12% as was shown in Figure 12.1.


Figure 12.5: Risk-Neutral Decision Tree Illustrating Volatility Effect on Convexity

Using the same methodology as before, the price of a 2-year zero-coupon bond with the
listed expected interest rates in Figure 12.5 is $0.858. 


Figure 12.6: Price of a 2-Year Zero-Coupon Bond With Increased Volatility 
This price results in a 2-year implied spot rate of 7.958%. Thus, the value of convexity
is 8% - 7.958% = 0.042% or 4.2 basis points. This is higher than the previous 2-year 
example where the value of convexity was 1.84 basis points when expected spot rates 
ranged from 4% to 12%, instead of 2% to 14%. Therefore, the value of convexity 
increases with both volatility and time to maturity. 


Risk Premium 


LO 12.e: Calculate the price and return of a zero-coupon bond incorporating a 
risk premium. 


Suppose an investor expects 1-year rates to resemble those in Figure 12.7. In this
example, rates have a volatility of 400 basis points per year, with 1-year rates
ranging from 4% to 12% in the second year.


Figure 12.7: Decision Tree Illustrating Expected 1-Year Rates for Two Years 
Next year, the 1-year return will be either 10% or 6%. A risk-neutral investor
calculates the price of a 2-year zero-coupon bond with a face value of $1 as follows:


\[ \frac{\left[\frac{\$1}{1.10} + \frac{\$1}{1.06}\right] \times 0.5}{1.08} = \frac{[\$0.90909 + \$0.94340] \times 0.5}{1.08} = \$0.85763 \]


In this example, the price of $0.85763 implies a 1-year expected return of 8%. However, 
this is only the average return. The actual return will be either 6% or 10%. Risk-averse 
investors would require a higher rate of return for this investment than an investment 
that has a certain 8% return with no variability. Thus, risk-averse investors require a 
risk premium for bearing this interest rate risk, and demand a return greater than 8% 
for buying a 2-year zero-coupon bond and holding it for the next year. 


EXAMPLE: Incorporating a risk premium 


Calculate the price and return for the zero-coupon bond using the expected 
returns in Figure 12.7 and assuming a risk premium of 30 basis points for each year 
of interest rate risk. 


Answer: 


The price of a 2-year zero-coupon bond with a 30 basis point risk premium 
included is calculated as: 

\[ \frac{\left[\frac{\$1}{1.103} + \frac{\$1}{1.063}\right] \times 0.5}{1.08} = \frac{[\$0.90662 + \$0.94073] \times 0.5}{1.08} = \$0.85525 \]


Notice that this price is less than the $0.85763 price calculated previously for the 
risk-neutral investor. Next year, the price of the 2-year zero-coupon bond will 
either be $0.90909 or $0.94340, depending on whether the 1-year rate is either 
10% or 6%, respectively. Thus, the expected return for the next year of the 2-year 
zero-coupon bond is 8.3%, calculated as follows: 

\[ \frac{(\$0.90909 + \$0.94340) \times 0.5 - \$0.85525}{\$0.85525} = 0.083 \]


Therefore, risk-averse investors require a 30 basis point premium or 8.3% return
to compensate for one year of interest rate risk. For a 3-year zero-coupon bond,
risk-averse investors will require a 60 basis point premium or 8.6% return given
two years of interest rate risk.
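The risk premium example can be sketched in Python as follows. This is a minimal illustration under the example’s assumptions (a current rate of 8%, next-year rates of 10% or 6% with equal probability, and a 30 basis point premium added to the rates used for discounting over the risky year); the function and variable names are hypothetical.

```python
def price_with_premium(r0, r_up, r_down, premium, p_up=0.5):
    """Price today of a 2-year $1 zero when next year's discount rates
    include a risk premium for one year of interest rate risk."""
    expected = (p_up / (1.0 + r_up + premium)
                + (1.0 - p_up) / (1.0 + r_down + premium))
    return expected / (1.0 + r0)   # no premium in year 1: the rate is known


price = price_with_premium(r0=0.08, r_up=0.10, r_down=0.06, premium=0.003)

# Next year the bond is a 1-year zero priced at the then-prevailing rate.
expected_price_next_year = 0.5 / 1.10 + 0.5 / 1.06
expected_return = expected_price_next_year / price - 1.0

print(f"price with premium = {price:.5f}")
print(f"expected 1-year return = {expected_return:.2%}")
```

Setting `premium=0.0` in the same function recovers the risk-neutral price, which makes it easy to see that the premium lowers the price and raises the expected return above 8%.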


PROFESSOR’S NOTE 

In the previous example, it is assumed that rates can change only once a year,
so in the first year there is no uncertainty of interest rates. There is only 
uncertainty in what the 1-year rate will be one and two years from today. 


MODULE QUIZ 12.2

1. What is the impact on the bond price-yield curve if, all other factors held constant, the
maturity of a zero-coupon bond increases? The pricing curve becomes:


A. less concave. 
B. more concave. 
C. less convex. 
D. more convex. 


2. Suppose an investor expects that the 1-year rate will remain at 6% for the first year 
for a 2-year zero-coupon bond. The investor also projects a 50% probability that the 1- 
year spot rate will be 8% in one year and a 50% probability that the 1-year spot rate 
will be 4% in one year. Which of the following inequalities most accurately reflects the 
convexity effect for this 2-year bond using Jensen’s inequality formula? 


A. $0.89031 > $0.89000. 
B. $0.89000 > $0.80000. 
C. $0.94340 > $0.890381. 
D. $0.94373 > $0.94340. 


KEY CONCEPTS 


LO 12.a 
If expected 1-year spot rates for the next three years are \(r_1\), \(r_2\), and \(r_3\), then the 2-year
spot rate is computed as \(\hat{r}(2) = \sqrt{(1 + r_1)(1 + r_2)} - 1\), and the 3-year spot rate is computed
as \(\hat{r}(3) = \sqrt[3]{(1 + r_1)(1 + r_2)(1 + r_3)} - 1\).


LO 12.b 
The volatility of expected rates creates convexity, which lowers future spot rates. 


LO 12.c 
The convexity effect can be measured by using Jensen’s inequality: 


\[ E\left[\frac{1}{1 + r}\right] > \frac{1}{E[1 + r]} \]


LO 12.d 

Convexity lowers bond yields due to volatility. This reduction in yields is equal to the
value of convexity. Thus, we can say that the value of convexity increases with
volatility. The value of convexity will also increase with maturity, because the
price-yield relationship becomes more convex as maturity lengthens.


LO 12.e 


Risk-averse investors will price bonds with a risk premium to compensate them for 
taking on interest rate risk. 


ANSWER KEY FOR MODULE QUIZZES 


Module Quiz 12.1 


1.B The 3-year spot rate can be solved for using the following equation:

\[ \frac{\$1}{(1.06)(1.08)(1.10)} = \frac{\$1}{(1 + \hat{r}(3))^3} \]

Solving for \(\hat{r}(3)\):

\[ \hat{r}(3) = \sqrt[3]{(1.06)(1.08)(1.10)} - 1 = 7.988\% \]

(LO 12.a)


2.C Assuming investors are risk-neutral, the following decision tree illustrates the 
calculation of the price of a 2-year zero-coupon bond using the expected rates 
given. The expected price in one year for the upper node is $0.93458, calculated 
as $1 / 1.07. The expected price in one year for the lower node is $0.95238, 
calculated as $1 / 1.05. Thus, the current price is $0.89007, calculated as: 

[0.5 x ($0.93458 / 1.06)] + [0.5 x ($0.95238 / 1.06)] = $0.89007 
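
[Figure: decision tree for the 2-year zero-coupon bond, showing a price of $0.89007 today, prices in one year of $0.93458 (upper node) and $0.95238 (lower node) with 50% probability each, and a $1 payoff at maturity.]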


(LO 12.b) 

3.D The implied 2-year spot rate is calculated by solving for \(\hat{r}(2)\) in the following
equation:

\[ \$0.88035 = \frac{\$1}{(1 + \hat{r}(2))^2} \]

\[ \hat{r}(2) = \sqrt{\frac{1}{0.88035}} - 1 = 0.06579 \text{ or } 6.579\% \]


Alternatively, this can also be computed using a financial calculator as follows:
PV = -0.88035; FV = 1; PMT = 0; N = 2; CPT → I/Y = 6.579%


(LO 12.b)


Module Quiz 12.2


1.D As the maturity of a bond increases, the price-yield relationship becomes more 
convex. (LO 12.d) 


2.A The left-hand side of Jensen’s inequality is the expected price in one year using the 
1-year spot rates of 8% and 4%. 

\[ E\left[\frac{\$1}{1 + r}\right] = 0.5 \times \frac{\$1}{1.08} + 0.5 \times \frac{\$1}{1.04} = 0.5 \times \$0.92593 + 0.5 \times \$0.96154 = \$0.94373 \]


The expected price in one year using an expected rate of 6% computes the right- 
hand side of the inequality as: 

\[ \frac{\$1}{E[1 + r]} = \frac{\$1}{0.5 \times 1.08 + 0.5 \times 1.04} = \frac{\$1}{1.06} = \$0.94340 \]


Next, divide each side of the equation by 1.06 to discount the expected 1-year 
zero-coupon bond price for one more year at 6%. The price of the 2-year zero- 
coupon bond equals $0.89031 (calculated as $0.94373 / 1.06), which is greater 
than $0.89000 (the price of a 2-year zero-coupon bond discounted for two years 
at the expected rate of 6%). Thus, Jensen’s inequality reveals that $0.89031 > 
$0.89000. (LO 12.c) 


The following is a review of the Market Risk Measurement and Management principles designed to address the 
learning objectives set forth by GARP®. Cross-reference to GARP assigned reading—Tuckman and Serrat, Chapter 9. 


READING 13 


THE ART OF TERM STRUCTURE 
MODELS: DRIFT 


Study Session 3 


EXAM FOCUS 


This reading introduces different term structure models for estimating short-term 
interest rates. Specifically, we will discuss models that have no drift (Model 1), constant 
drift (Model 2), time-deterministic drift (Ho-Lee), and mean-reverting drift (Vasicek). 
For the exam, understand the differences between these short rate models, and know 
how to construct a two-period interest rate tree using model predictions. Also, know 
how the limitations of each model impact model effectiveness. For the Vasicek model, 
understand how to convert a nonrecombining tree into a combining tree. 


MODULE 13.1: TERM STRUCTURE MODELS 


LO 13.a: Construct and describe the effectiveness of a short-term interest rate 
tree assuming normally distributed rates, both with and without drift. 


Term Structure Model With No Drift (Model 1) 


This reading begins with the simplest model for predicting the evolution of short rates
(Model 1), which is used in cases where there is no drift and interest rates are normally
distributed. The continuously compounded instantaneous rate, denoted r, will change
(over time) according to the following relationship:

\[ dr = \sigma \, dw \]

where:
dr = change in interest rates over small time interval, dt
dt = small time interval (measured in years) (e.g., one month = 1/12)
\(\sigma\) = annual basis point volatility of rate changes
dw = normally distributed random variable with mean 0 and
standard deviation \(\sqrt{dt}\)


Given this definition, we can build an interest rate tree using a binomial model. The
probability of up and down movements will be the same from period to period (50% up
and 50% down) and the tree will be recombining. Since the tree is recombining, the up-
down path ends up at the same place as the down-up path in the second time period.


For example, consider the evolution of interest rates on a monthly basis. Assume the
current short-term interest rate is 6% and annual volatility is 120bps. Using the
notation just listed, \(r_0 = 6\%\), \(\sigma = 1.20\%\), and \(dt = 1/12\). Therefore, dw has a mean of 0
and standard deviation of \(\sqrt{1/12} = 0.2887\).


After one month passes, assume the random variable dw takes on a value of 0.2 (drawn 
from a normal distribution with mean = 0 and standard deviation = 0.2887). Therefore, 
the change in interest rates over one month is calculated as: dr = 1.20% x 0.2 = 0.24% = 
24 basis points. Since the initial rate was 6% and interest rates “changed” by 0.24%, the 
new spot rate in one month will be: 6% + 0.24% = 6.24%. 


LO 13.b: Calculate the short-term rate change and standard deviation of the rate 
change using a model with normally distributed rates and no drift. 


In Model 1, since the expected value of dw is zero [i.e., E(dw) = 0], the drift will be zero.
Also, since the standard deviation of dw = \(\sqrt{dt}\), the volatility of the rate change = \(\sigma\sqrt{dt}\).
This expression is also referred to as the standard deviation of the rate.


In the preceding example, the standard deviation of the rate is calculated as:

\[ 1.2\% \times \sqrt{1/12} = 0.346\% = 34.6 \text{ basis points} \]


Returning to our previous discussion, we are now ready to construct an interest rate 
tree using Model 1. A generic interest rate tree over two periods is presented in Figure 
13.1. Note that this tree is recombining and the ending rate at time 2 for the middle
node is the same as the initial rate, \(r_0\). Hence, the model has no drift.


Figure 13.1: Interest Rate Tree With No Drift 


The interest rate tree using the previous numerical example is shown in Figure 13.2. 
One period from now, the observed interest rate will either increase with 50% 
probability to: 6% + 0.346% = 6.346% or decrease with 50% probability to: 6% - 
0.346% = 5.654%. Extending to two periods completes the tree with upper node: 6% + 
2(0.346%) = 6.692%, middle node: 6% (unchanged), and lower node: 6% - 2(0.346%) = 
5.308%. 


Figure 13.2: Numerical Example of Interest Rate Tree With No Drift 
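
The tree in this numerical example can be reproduced with a short Python sketch. It is only an illustration of the Model 1 construction described above, where each step moves the rate up or down by \(\sigma\sqrt{dt}\); the function name `model1_tree` and its parameters are assumptions made for the sketch.

```python
import math


def model1_tree(r0, sigma, dt, steps):
    """Recombining no-drift tree: each step moves the rate by +/- sigma*sqrt(dt).

    Returns a list of levels; level t holds t + 1 rates, top node first.
    """
    step = sigma * math.sqrt(dt)
    return [
        [r0 + (t - 2 * j) * step for j in range(t + 1)]   # j = number of down moves
        for t in range(steps + 1)
    ]


tree = model1_tree(r0=0.06, sigma=0.012, dt=1 / 12, steps=2)
for t, level in enumerate(tree):
    print(f"t = {t}:", ", ".join(f"{r:.3%}" for r in level))
```

The middle node at time 2 equals the initial 6% rate, reflecting the absence of drift. The outer nodes may differ from the figures in the text in the last decimal place because the text doubles the rounded 0.346% step.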
LO 13.c: Describe methods for addressing the possibility of negative short-term
rates in term structure models.


Note that the terminal nodes in the two-period model generate three possible ending
rates: \(r_0 + 2\sigma\sqrt{dt}\), \(r_0\), and \(r_0 - 2\sigma\sqrt{dt}\). This discrete, finite set of outcomes does not
technically represent a normal distribution. However, our knowledge of probability
distributions tells us that as the number of steps increases, the terminal distribution at
the nodes will approach a continuous normal distribution.


One obvious drawback to Model 1 is that there is always a positive probability that
interest rates could become negative. On the surface, negative interest rates do not
make much economic sense (i.e., lending $100 and receiving less than $100 back in the
future). However, you could plausibly rationalize a small negative interest rate if the
safety concerns and/or inconvenience of holding cash were sufficiently high.


The negative interest rate problem will be exacerbated as the investment horizon gets 
longer, since it is more likely that forecasted interest rates will drop below zero. As an 
illustration, assume a ten-year horizon and a standard deviation of terminal interest 
rates of 1.2% × √10 = 3.79%. It is clear that negative interest rates will be well within a 
two standard deviation confidence interval when centered around a current rate of 6%. 
Also note that the problem of negative interest rates is greater when the current level 
of interest rates is low (e.g., 4% instead of the original 6%). 
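
The size of the problem can be gauged with a quick calculation. The sketch below assumes, as Model 1 does, that the terminal rate is normally distributed around the current rate with standard deviation σ√T. The function name and the normal CDF shortcut are illustrative choices, not part of the original example.

```python
import math

def prob_negative_rate(r0, sigma_annual, years):
    """P(terminal rate < 0) under Model 1 (normal distribution, no drift)."""
    terminal_sd = sigma_annual * math.sqrt(years)
    z = (0.0 - r0) / terminal_sd
    # Standard normal CDF written with the error function
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

terminal_sd = 0.012 * math.sqrt(10)            # about 3.79%
p_at_6 = prob_negative_rate(0.06, 0.012, 10)   # current rate of 6%
p_at_4 = prob_negative_rate(0.04, 0.012, 10)   # current rate of 4%

print(f"terminal sd        = {terminal_sd:.2%}")
print(f"P(r < 0), r0 = 6%  = {p_at_6:.1%}")
print(f"P(r < 0), r0 = 4%  = {p_at_4:.1%}")
```

Under these assumptions the probability of a negative terminal rate is noticeably higher when the starting rate is 4% than when it is 6%, which is the point made above.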


There are two reasonable solutions for negative interest rates. First, the model could 
use distributions that are always non-negative, such as lognormal or chi-squared 
distributions. In this way, the interest rate can never be negative, but this choice may 
introduce other undesirable characteristics such as skewness or inappropriate 
volatilities. Second, the interest rate tree can “force” negative interest rates to take a 
value of zero. In this way, the original interest rate tree is adjusted to constrain the 
distribution from being below zero. This method may be preferred over the first 
method because it forces a change in the original distribution only in a very low 
interest rate environment whereas changing the entire distribution will impact a much 
wider range of rates. 
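
The second approach can be sketched as follows. This is a minimal illustration, assuming a recombining no-drift tree with equal up and down steps of σ√dt. The low starting rate of 0.5% is a hypothetical input chosen only so that a negative node appears within two periods.

```python
import math

def model1_tree(r0, sigma_annual, dt, periods, floor_at_zero=False):
    """Recombining no-drift tree; node j in period n has had j up moves."""
    step = sigma_annual * math.sqrt(dt)
    tree = []
    for n in range(periods + 1):
        level = [r0 + (2 * j - n) * step for j in range(n + 1)]
        if floor_at_zero:
            # "Force" any negative rate to take a value of zero
            level = [max(rate, 0.0) for rate in level]
        tree.append(level)
    return tree

# Hypothetical low-rate case: r0 = 0.5%, sigma = 1.2%, monthly steps
raw = model1_tree(0.005, 0.012, 1 / 12, 2)
floored = model1_tree(0.005, 0.012, 1 / 12, 2, floor_at_zero=True)

print([f"{r:.3%}" for r in raw[2]])
print([f"{r:.3%}" for r in floored[2]])
```

Only the node that would have been negative changes; every other node keeps its original value, which is why this adjustment affects the distribution only in a very low rate environment.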


As a final note, it is ultimately up to the user to decide on the appropriateness of the 
model. For example, if the purpose of the term structure model is to price coupon-paying 
bonds, then the valuation is closely tied to the average interest rate over the life 
of the bond and the possible effect of negative interest rates (small probability of 
occurring or staying negative for long) is less important. On the other hand, option 
valuation models that have asymmetric payoffs will be more affected by the negative 
interest rate problem. 


Model 1 Effectiveness 


Given the no-drift assumption of Model 1, we can draw several conclusions regarding 
the effectiveness of this model for predicting the shape of the term structure: 

■ The no-drift assumption does not give enough flexibility to accurately model basic 
term structure shapes. The result is a downward-sloping predicted term structure 
due to a larger convexity effect. Recall that the convexity effect is the difference 
between the model par yield using its assumed volatility and the par yield in the 
structural model with assumed zero volatility. 

■ Model 1 predicts a flat term structure of volatility, whereas the observed volatility 
term structure is hump-shaped, rising and then falling. 

■ Model 1 only has one factor, the short-term rate. Other models that incorporate 
additional factors (e.g., drift, time-dependent volatility) form a richer set of 
predictions. 

■ Model 1 implies that any change in the short-term rate would lead to a parallel shift 
in the yield curve, again, a finding incongruous with observed (nonparallel) yield 
curve shifts. 


Term Structure Model With Drift (Model 2) 


Casual term structure observation typically reveals an upward-sloping yield curve, 
which is at odds with Model 1, which does not incorporate drift. A natural extension to 
Model 1 is to add a positive drift term that can be economically interpreted as a 
positive risk premium associated with longer time horizons. We can augment Model 1 
with a constant drift term, which yields Model 2: 


dr = λdt + σdw 


Let’s continue with a new example assuming a current short-term interest rate, r_0, of 
5%, drift, λ, of 0.24%, and standard deviation, σ, of 1.50%. As before, the dw realization 
drawn from a normal distribution (with mean = 0 and standard deviation = 0.2887) is 
0.2. Thus, the change in the short-term rate in one month is calculated as: 

dr = 0.24% × (1/12) + 1.5% × 0.2 = 0.32% 


Hence, the new rate, r_1, is computed as: 5% + 0.32% = 5.32%. The monthly drift is 
0.24% × 1/12 = 0.02% and the monthly standard deviation of the rate is 
1.5% × √(1/12) = 0.43% (i.e., 43 basis points per month). The 2bps drift per month 
(0.02%) represents any combination of expected changes in the short-term rate (i.e., 
true drift) and a risk premium. For example, the 2bps observed drift could result from 
a 1.5bp expected change in rates coupled with a 0.5bp risk premium. 
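
The arithmetic in this example can be reproduced with a few lines of code. The sketch below simply evaluates dr = λdt + σdw with the inputs from the text; the function and variable names are illustrative.

```python
import math

def model2_rate_change(lam, sigma, dt, dw):
    """dr = lambda*dt + sigma*dw (constant drift, constant volatility)."""
    return lam * dt + sigma * dw

r0, lam, sigma, dt, dw = 0.05, 0.0024, 0.015, 1 / 12, 0.2

dr = model2_rate_change(lam, sigma, dt, dw)
r1 = r0 + dr
monthly_drift = lam * dt               # about 2 basis points
monthly_sd = sigma * math.sqrt(dt)     # about 43 basis points

print(f"dr = {dr:.2%}, new rate = {r1:.2%}")
print(f"monthly drift = {monthly_drift:.2%}, monthly sd = {monthly_sd:.2%}")
```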


The interest rate tree for Model 2 will look very similar to Model 1, but the drift 
raises every node by λdt in the first period, by 2λdt in the second period, and so on. 
This is visually represented in Figure 13.3. Note that the tree recombines at time 2, but 
the value at time 2, r_0 + 2λdt, is greater than the original rate, r_0, due to the positive 
drift. 


Figure 13.3: Interest Rate Tree With Constant Drift 

Model 2 Effectiveness 


As you would expect, Model 2 is more effective than Model 1. Intuitively, the drift term 
can accommodate the typically observed upward-sloping nature of the term structure. 
In practice, a researcher is likely to choose r_0 and λ based on the calibration of 
observed rates. Hence, the term structure will fit better. The downside of this approach 
is that the estimated value of drift could be relatively high, especially if considered as a 
risk premium only. On the other hand, if the drift is viewed as a combination of the risk 
premium and the expected rate change, the model suggests that the expected rates in 
Year 10 will be higher than Year 9, for example. This view is more appropriate in the 
short run because it is more difficult to justify increases in expected rates in the long 
run. 


Ho-Lee Model 


LO 13.d: Construct a short-term rate tree under the Ho-Lee Model with time- 
dependent drift. 


The Ho-Lee model further generalizes the drift to incorporate time-dependency. That 
is, the drift in Time 1 may be different than the drift in Time 2; additionally, each drift 
does not have to increase and can even be negative. Thus, the model is more flexible 
than the constant drift model. Once again, the drift is a combination of the risk 
premium over the period and the expected rate change. The tree in Figure 13.4 
illustrates the interest rate structure and effect of time-dependent drift. 


Figure 13.4: Interest Rate Tree With Time-Dependent Drift 

It is clear that if λ_1 = λ_2, then the Ho-Lee model reduces to Model 2. Also, it should not 
be surprising that λ_1 and λ_2 are estimated from observed market prices. In other words, 
r_0 is the observed one-period spot rate. λ_1 could then be estimated so that the model 
rate equals the observed two-period market rate. λ_2 could be calibrated using r_0 
and λ_1 and the observed market rate for a three-period security, and so on. 
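
A short sketch of the tree construction may help. It assumes the standard recombining layout in which each period adds that period's drift times dt to every node, plus or minus σ√dt. The drift values used here are hypothetical and are not calibrated to any market data.

```python
import math

def ho_lee_tree(r0, drifts, sigma, dt):
    """Build a recombining Ho-Lee tree.

    drifts[i] is the annualized drift that applies over period i + 1.
    Node j in period n has had j up moves and n - j down moves.
    """
    step = sigma * math.sqrt(dt)
    tree = [[r0]]
    cum_drift = 0.0
    for n, lam in enumerate(drifts, start=1):
        cum_drift += lam * dt
        tree.append([r0 + cum_drift + (2 * j - n) * step for j in range(n + 1)])
    return tree

# Hypothetical inputs: lambda_1 = 0.30%, lambda_2 = -0.10%, monthly steps
tree = ho_lee_tree(0.05, [0.0030, -0.0010], 0.015, 1 / 12)
middle_node = tree[2][1]        # equals r0 + (lambda_1 + lambda_2) * dt

# With equal drifts the tree collapses to the constant-drift Model 2
model2_like = ho_lee_tree(0.05, [0.0024, 0.0024], 0.015, 1 / 12)

print(f"Ho-Lee middle node at time 2:  {middle_node:.4%}")
print(f"Model 2 middle node at time 2: {model2_like[2][1]:.4%}")
```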


MODULE QUIZ 13.1 
1. Using Model 1, assume the current short-term interest rate is 5%, annual volatility is 
80bps, and dw, a normally distributed random variable with mean 0 and standard 
deviation √dt, has an expected value of zero. After one month, the realization of dw 
is -0.5. What is the change in the spot rate and the new spot rate? 


   Change in spot    New spot rate 
A. 0.40%             5.40% 
B. -0.40%            4.60% 
C. 0.80%             5.80% 
D. -0.80%            4.20% 


2. Using Model 2, assume a current short-term rate of 8%, an annual drift of 50bps, and 
a short-term rate standard deviation of 2%. In addition, assume the ex-post 
realization of the dw random variable is 0.3. After constructing a 2-period interest 
rate tree with annual periods, what is the interest rate in the middle node at the end 
of Year 2? 

A. 8.0%. 
B. 8.8%. 
C. 9.0%. 
D. 9.6%. 


MODULE 13.2: ARBITRAGE-FREE MODELS 


LO 13.e: Describe uses and benefits of the arbitrage-free models and assess the 
issue of fitting models to market prices. 


Broadly speaking, there are two types of models: arbitrage-free models and equilibrium 
models. The key factor in choosing between these two models is the need to 
match market prices. Arbitrage-free models are often used to quote the prices of 
securities that are illiquid or customized. For example, an arbitrage-free tree is 
constructed to properly price on-the-run Treasury securities (i.e., the model price must 
match the market price). Then, the arbitrage-free tree is used to price off-the-run 
Treasury securities, and the model prices are compared to market prices to determine if 
the bonds are properly valued. Arbitrage-free models are also commonly used for pricing 
derivatives based on observable prices of the underlying security (e.g., options on bonds). 


There are two potential drawbacks of arbitrage-free models. First, calibrating to market 
prices is still subject to the suitability of the original pricing model. For example, if the 
parallel shift assumption is not appropriate, then a better fitting model (by adding drift) 
will still be faulty. Second, arbitrage-free models assume the underlying prices are 
accurate. This will not be the case if there is an external, temporary, exogenous shock 
(e.g., an oversupply of securities from forced liquidation, which temporarily depresses 
market prices). 


If the purpose of the model is relative analysis (i.e., comparing the value of one security 
to another), then using arbitrage-free models, which assume both securities are 
properly priced, is meaningless. Hence, for relative analysis, equilibrium models would 
be used rather than arbitrage-free models. 


Vasicek Model 


LO 13.f: Describe the process of constructing a simple and recombining tree for 
a short-term rate under the Vasicek Model with mean reversion. 


The Vasicek model assumes a mean-reverting process for short-term interest rates. The 
underlying assumption is that the economy has an equilibrium level based on economic 
fundamentals such as long-run monetary supply, technological innovations, and similar 
factors. Therefore, if the short-term rate is above the long-run equilibrium value, the 
drift adjustment will be negative to bring the current rate closer to its mean-reverting 
level. Similarly, short-term rates below the long-run equilibrium will have a positive 
drift adjustment. Mean reversion is a reasonable assumption but clearly breaks down in 
periods of extremely high inflation (i.e., hyperinflation) or similar structural breaks. 


The formal Vasicek model is as follows: 

dr = k(θ - r)dt + σdw 

where: 

k = a parameter that measures the speed of reversion adjustment 
θ = long-run value of the short-term rate assuming risk neutrality 
r = current interest rate level 


In this model, k measures the speed of the mean reversion adjustment; a high k will 
produce quicker (larger) adjustments than smaller values of k. A larger differential 
between the long-run and current rates will produce a larger adjustment in the current 
period. 


Similar to the previous discussion, the drift term, λ, is a combination of the expected 
rate change and a risk premium. The risk neutrality assumption of the long-run value of 
the short-term rate allows θ to be approximated as: 

θ ≈ r_l + λ/k 

where: 

r_l = long-run true rate of interest 


Let’s consider a numerical example with a reversion adjustment parameter of 0.03, 
annual standard deviation of 150 basis points, a true long-term interest rate of 6%, a 
current interest rate of 6.2%, and annual drift of 0.36%. The long-run value of the 
short-term rate assuming risk neutrality is approximately: 

θ ≈ 6% + 0.36%/0.03 = 18% 


It follows that the forecasted change in the short-term rate for the next period is: 

0.03(18% - 6.2%)(1/12) = 0.0295% 

The volatility for the monthly interval is computed as 1.5% × √(1/12) = 0.43% (43 basis 
points per month). 


The next step is to populate the interest rate tree. Note that this tree will not recombine 
in the second period because the adjustment in time 2 after a downward movement in 
interest rates will be larger than the adjustment in time 2 following an upward 
movement in interest rates (since the lower node rate is further from the long-run 
value). This can be illustrated directly in the following calculations. Starting with 
r_0 = 6.2%, the interest rate tree over the first period is: 


Figure 13.5: First Period Upper and Lower Node Calculations 

If the interest rate evolves upward in the first period, we would turn to the upper node 
in the second period. The interest rate process can move up to 7.124% or down to 
6.258%. 


Figure 13.6: Second Period Upper Node Calculations 

If the interest rate evolves downward in the first period, we would turn to the lower node in 
the second period. The interest rate process can move up to 6.260% or down to 5.394%. 


Figure 13.7: Second Period Lower Node Calculations 

Finally, we complete the 2-period interest rate tree with mean reversion. The most 
interesting observation is that the tree is not recombining. The up-down path leads 
to a 6.258% rate while the down-up path leads to a 6.260% rate. In addition, the down- 
up path rate is larger than the up-down path rate because the mean reversion 
adjustment has to be larger for the down path, as the initial interest rate was lower 
(5.796% versus 6.663%). 
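
The node values quoted above can be reproduced directly from the Vasicek dynamics. The sketch below assumes each step adds the drift k(θ - r)dt and then moves up or down by σ√dt, using the parameters of the running example; the helper name is illustrative.

```python
import math

def vasicek_step(r, k, theta, sigma, dt):
    """Return the (up, down) nodes one period after a node with rate r."""
    drift = k * (theta - r) * dt
    shock = sigma * math.sqrt(dt)
    return r + drift + shock, r + drift - shock

k, theta, sigma, dt = 0.03, 0.18, 0.015, 1 / 12

r_u, r_d = vasicek_step(0.062, k, theta, sigma, dt)     # first period
r_uu, r_ud = vasicek_step(r_u, k, theta, sigma, dt)     # after an up move
r_du, r_dd = vasicek_step(r_d, k, theta, sigma, dt)     # after a down move

for name, rate in [("u", r_u), ("d", r_d), ("uu", r_uu),
                   ("ud", r_ud), ("du", r_du), ("dd", r_dd)]:
    print(f"r_{name} = {rate:.3%}")
```

The up-down and down-up nodes differ slightly because the drift applied at the lower node is larger than the drift applied at the upper node.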


Figure 13.8: 2-Period Interest Rate Tree With Mean Reversion 

At this point, the Vasicek model has generated a 2-period nonrecombining tree of 
short-term interest rates. It is possible to modify the methodology so that a recombining 
tree is the end result. There are several ways to do this, but we will outline one 
straightforward method. The first step is to take an average of the two middle nodes: 
(6.258% + 6.260%) / 2 = 6.259%. Next, we remove the assumption of 50% up and 50% 
down movements by generically replacing them with (p, 1 - p) and (q, 1 - q) as shown 
in Figure 13.9. 


Figure 13.9: Recombining the Interest Rate Tree 

The final step for recombining the tree is to solve for p, q, r_uu, and r_dd. Here, p and q 
are the respective probabilities of up movements in the trees in the second period after 
the up and down movements in the first period, and r_uu and r_dd are the respective 
interest rates from successive (up, up and down, down) movements in the tree. 


We can solve for the unknown values using a system of equations. First, we know that 
the probability-weighted average of the two nodes that follow an up move, 
p × r_uu + (1 - p) × 6.259%, must equal the expected rate: 

6.663% + 0.03(18% - 6.663%)(1/12) = 6.691% 

Second, we can use the definition of standard deviation to equate: 

√[p(r_uu - 6.691%)² + (1 - p)(6.259% - 6.691%)²] = 1.50% × √(1/12) 

We would then repeat the process for the bottom portion of the tree, solving for q and 
r_dd. If the tree extends into a third period, the entire process repeats iteratively. 
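
The two conditions (matching the expected rate and matching the standard deviation) can be solved in closed form. The sketch below is one way to do it, obtained by substituting the mean condition into the variance condition; the function name is illustrative. The 5.827% expected rate used for the lower branch is computed here from the same drift formula and is not quoted in the text.

```python
import math

def recombine_branch(expected, middle, sd):
    """Solve for the outer node x and its probability p such that
         p * x + (1 - p) * middle = expected
         sqrt(p * (x - expected)**2 + (1 - p) * (middle - expected)**2) = sd
    """
    d = expected - middle             # signed gap between mean and middle node
    p = d ** 2 / (d ** 2 + sd ** 2)   # probability of reaching the outer node
    x = expected + sd ** 2 / d        # outer node rate
    return p, x

sd = 0.015 * math.sqrt(1 / 12)
middle = 0.06259

# Upper branch: the outer node is r_uu and p is the up probability
p, r_uu = recombine_branch(0.06691, middle, sd)

# Lower branch: the outer node is r_dd, reached with probability 1 - q
expected_down = 0.05796 + 0.03 * (0.18 - 0.05796) / 12
one_minus_q, r_dd = recombine_branch(expected_down, middle, sd)
q = 1 - one_minus_q

print(f"p = {p:.4f}, r_uu = {r_uu:.3%}")
print(f"q = {q:.4f}, r_dd = {r_dd:.3%}")
```

Both probabilities come out very close to, but not exactly, 50%, which is the price paid for forcing the middle nodes to coincide.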


LO 13.g: Calculate the Vasicek Model rate change, standard deviation of the rate 
change, expected rate in T years and half-life. 


The previous discussion encompassed the rate change in the Vasicek model and the 
computation of the standard deviation when solving for the parameters in the 
recombining tree. In this section, we turn our attention to the forecasted rate in T years. 


To continue with the previous example, the current short-term rate is 6.2% with the 
mean reversion parameter, k, of 0.03. The rate will eventually approach its long-run 
mean-reverting level of 18%, but it will take a long time since the value of k is quite 
small. Specifically, the current rate of 6.2% is 11.8% from its ultimate natural level 
(18% - 6.2%), and this difference will decay exponentially at the rate of mean 
reversion. To forecast the rate in 10 years, we note that 11.8% × e^(-0.03 × 10) = 
8.74%. Therefore, the expected rate in 10 years is 18% - 8.74% = 9.26%. 


In the Vasicek model, the expected rate in T years can be represented as the weighted 
average between the current short-term rate and its long-run horizon value. The 
weighting factor for the short-term rate decays exponentially at the speed of the 
mean-reverting parameter, k: 

r_0 × e^(-kT) + θ(1 - e^(-kT)) 

A more intuitive measure for computing the forecasted rate in T years uses a factor’s 
half-life, τ, which measures the number of years to close half the distance between the 
starting rate and mean-reverting level. Numerically: 

(18% - 6.2%)e^(-0.03τ) = ½(18% - 6.2%) 
e^(-0.03τ) = ½ → τ = ln(2) / 0.03 = 23.1 years 

PROFESSOR’S NOTE 
A larger mean reversion adjustment parameter, k, will result in a shorter half-life. 
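
The expected-rate and half-life calculations can be checked numerically. The sketch below uses the weighted-average form of the expected rate and the ln(2)/k half-life with the inputs of the running example; the function names are illustrative.

```python
import math

def vasicek_expected_rate(r0, theta, k, T):
    """Expected short-term rate in T years: weighted average of r0 and theta."""
    weight = math.exp(-k * T)
    return r0 * weight + theta * (1 - weight)

def half_life(k):
    """Years needed to close half the gap to the mean-reverting level."""
    return math.log(2) / k

expected_10y = vasicek_expected_rate(0.062, 0.18, 0.03, 10)
tau = half_life(0.03)
rate_at_half_life = vasicek_expected_rate(0.062, 0.18, 0.03, tau)

print(f"expected rate in 10 years = {expected_10y:.2%}")
print(f"half-life = {tau:.1f} years")
print(f"expected rate at the half-life = {rate_at_half_life:.2%}")
```

At the half-life, the expected rate sits exactly midway between 6.2% and 18%.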


Vasicek Model Effectiveness 


LO 13.h: Describe the effectiveness of the Vasicek Model. 


There are some general comments that we can make to compare mean-reverting 
(Vasicek) models to models without mean reversion. In development of the 
mean-reverting model, the parameters r_0 and θ were calibrated to match observed market 
prices. Hence, the mean reversion parameter not only improves the specification of the 
term structure, but also produces a specific term structure of volatility. Specifically, the 
Vasicek model will produce a term structure of volatility that is declining. Therefore, 
short-term volatility is overstated and long-term volatility is understated. In contrast, 
Model 1 with no drift generates a flat volatility of interest rates across all maturities. 


Furthermore, consider an upward shift in the short-term rate. In the mean-reverting 
model, the short-term rate will be impacted more than long-term rates. Therefore, the 
Vasicek model does not imply parallel shifts from exogenous liquidity shocks. Another 
interpretation concerns the nature of the shock. If the shock is based on short-term 
economic news, then the mean reversion model implies the shock dissipates as it 
approaches the long-run mean. The larger the mean reversion parameter, the quicker 
the economic news is incorporated. Similarly, the smaller the mean reversion 
parameter, the longer it takes for the economic news to be assimilated into security 
prices. In this case, the economic news is long-lived. In contrast, shocks to short-term 
rates in models without drift affect all rates equally regardless of maturity (i.e., produce 
a parallel shift). 


MODULE QUIZ 13.2 
1. The Bureau of Labor Statistics has just reported an unexpected short-term increase in 
high-priced luxury automobiles. What is the most likely anticipated impact on a 
mean-reverting model of interest rates? 
A. The economic information is long-lived with a low mean reversion parameter. 
B. The economic information is short-lived with a low mean reversion parameter. 
C. The economic information is long-lived with a high mean reversion parameter. 
D. The economic information is short-lived with a high mean reversion parameter. 


2. Using the Vasicek model, assume a current short-term rate of 6.2% and an annual 
volatility of the interest rate process of 2.5%. Also assume that the long-run 
mean-reverting level is 13.2% with a speed of adjustment of 0.4. Within a binomial interest 
rate tree, what are the upper and lower node rates after the first month? 


   Upper node    Lower node 
A. 6.67%         5.71% 
B. 6.67%         6.24% 
C. 7.16%         6.24% 
D. 7.16%         5.71% 


3. John Jones, FRM, is discussing the appropriate usage of mean-reverting models 
relative to no-drift models, models that incorporate drift, and Ho-Lee models. Jones 
makes the following statements: 


Statement 1: Both Model 1 (no drift) and the Vasicek model assume parallel shifts 
from changes in the short-term rate. 


Statement 2: The Vasicek model assumes decreasing volatility of future short-term 
rates while Model 1 assumes constant volatility of future short-term 
rates. 


Statement 3: The constant drift model (Model 2) is a more flexible model than the Ho- 
Lee model. 


How many of his statements are correct? 
A. 0. 
B. 1. 
C. 2. 
D. 3. 


KEY CONCEPTS 


LO 13.a 
Model 1 assumes no drift and that interest rates are normally distributed. The 
continuously compounded instantaneous rate, r, will change according to: 

dr = σdw 

Model 1 limitations: 

■ The no-drift assumption is not flexible enough to accommodate basic term structure 
shapes. 
■ The term structure of volatility is predicted to be flat. 
■ There is only one factor, the short-term rate. 
■ Any change in the short-term rate would lead to a parallel shift in the yield curve. 


Model 2 adds a constant drift: dr = λdt + σdw. The new interest rate tree increases each 
node in the next time period by λdt. The drift combines the expected rate change with a 
risk premium. The interest rate tree is still recombining, but the middle node rate at 
Time 2 will not equal the initial rate, as was the case with Model 1. 


Model 2 limitations: 

■ The calibrated values of drift are often too high. 
■ The model requires forecasting different risk premiums for long horizons where 
reliable forecasts are unrealistic. 


LO 13.b 


The interest rate tree for Model 1 is recombining, and at each node the rate moves up or 
down by the same amount with equal (50%) probability. 


LO 13.c 


The normality assumption of the terminal interest rates for Model 1 will always have a 
positive probability of negative interest rates. One solution to eliminate this negative 
rate problem is to use non-negative distributions, such as the lognormal distribution; 
however, this may introduce other undesirable features into the model. An alternative 
solution is to create an adjusted interest rate tree where negative interest rates are 
replaced with 0%, constraining the data from being negative. 


LO 13.d 


The Ho-Lee model introduces even more flexibility than Model 2 by allowing the drift 
term to vary from period to period (i.e., time-dependent drift). The recombined middle 
node at Time 2 = r_0 + (λ_1 + λ_2)dt. 


LO 13.e 


Arbitrage-free models are often used to price securities that are illiquid or off-market 
(e.g., uncommon maturity for a swap). The more liquid security prices are used to 
develop a consistent pricing model, which in turn is used for illiquid or non-standard 
securities. Because arbitrage-free models assume the market price is "correct," the 
models will not be effective if there are short-term imbalances altering bond prices. 
Similarly, arbitrage-free models cannot be used in relative valuation analysis because 
the securities being compared are already assumed to be properly priced. 


LO 13.f 
The Vasicek model assumes mean reversion to a long-run equilibrium rate. The specific 
functional form of the Vasicek model is as follows: 

dr = k(θ - r)dt + σdw 

The parameter k measures the speed of the mean reversion adjustment; a high k will 
produce quicker (larger) adjustments than smaller values of k. Assuming there is a 
long-run interest rate of r_l, the long-run mean-reverting level is: 

θ ≈ r_l + λ/k 

The Vasicek tree is not recombining. The tree can be approximated as recombining by 
averaging the two unequal middle nodes and recalibrating the associated probabilities 
(i.e., no longer using 50% probabilities for the up and down moves). 


LO 13.g 


The expected rate in T years can be forecasted assuming exponential decay of the 
difference between the current level and the mean-reverting level. The half-life, τ, can 
be computed as the time to move halfway between the current level and the 
mean-reverting level: 

(θ - r_0)e^(-kτ) = ½(θ - r_0) 

LO 13.h 
The Vasicek model not only improves the specification of the term structure, but also 
produces a downward-sloping term structure of volatility. Model 1, on the other hand, 
predicts flat volatility of interest rates across all maturities. Model 1 implies parallel 
shifts from exogenous shocks while the Vasicek model does not. Long- (short-) lived 
economic shocks have low (high) mean reversion parameters. In contrast, in Model 1, 
shocks to short-term rates affect all rates equally regardless of maturity. 


ANSWER KEY FOR MODULE QUIZZES 


Module Quiz 13.1 


1. B Model 1 has a no-drift assumption. Using this model, the change in the interest 
rate is predicted as: 
dr = σdw 
dr = 0.8% × (-0.5) = -0.4% = -40 basis points 
Since the initial rate was 5% and dr = -0.40%, the new spot rate in one month is: 
5% - 0.40% = 4.60% 
(LO 13.a) 


2. C Using Model 2 notation: 

current short-term rate, r_0 = 8% 
drift, λ = 0.5% 
standard deviation, σ = 2% 
random variable, dw = 0.3 
change in time, dt = 1 

Since we are asked to find the interest rate at the second period middle node 
using Model 2, we know that the tree will recombine to the following rate: 
r_0 + 2λdt. 

8% + 2 × 0.5% × 1 = 9% 
(LO 13.a) 


Module Quiz 13.2 


1. D The economic news is most likely short-term in nature. Therefore, the mean 
reversion parameter is high so the mean reversion adjustment per period will be 
relatively large. (LO 13.h) 


2. D Using a Vasicek model, the upper and lower nodes for Time 1 are computed as 
follows: 

upper node = 6.2% + (0.4)(13.2% - 6.2%)/12 + 2.5%/√12 = 7.16% 
lower node = 6.2% + (0.4)(13.2% - 6.2%)/12 - 2.5%/√12 = 5.71% 
(LO 13.f) 


3. B Only Statement 2 is correct. The Vasicek model implies decreasing volatility and 
nonparallel shifts from changes in short-term rates. The Ho-Lee model is actually 
more general than Model 2 (the no drift and constant drift models are special 
cases of the Ho-Lee model). (LO 13.f) 


The following is a review of the Market Risk Measurement and Management principles 
designed to address the learning objectives set forth by GARP®. Cross-reference to 
GARP assigned reading: Tuckman and Serrat, Chapter 10. 


READING 14 


THE ART OF TERM STRUCTURE 
MODELS: VOLATILITY AND 
DISTRIBUTION 


Study Session 3 


EXAM FOCUS 


This reading incorporates non-constant volatility into term structure models. The 
generic time-dependent volatility model is very flexible and particularly useful for 
valuing multi-period derivatives like interest rate caps and floors. The Cox-Ingersoll- 
Ross (CIR) mean-reverting model suggests that the term structure of volatility 
increases with the level of interest rates and does not become negative. The lognormal 
model also has non-negative interest rates that proportionally increase with the level 
of the short-term rate. For the exam, you should understand how these models impact 
the short-term rate process, and be able to identify how a time-dependent volatility 
model (Model 3) differs from the models discussed in the previous reading. Also, 
understand the differences between the CIR and the lognormal models, as well as the 
differences between the lognormal models with drift and mean reversion. 


MODULE 14.1: TIME-DEPENDENT VOLATILITY 
MODELS 


LO 14.a: Describe the short-term rate process under a model with time- 
dependent volatility. 


This reading provides a natural extension to the prior reading on modeling term 
structure drift by incorporating the volatility of the term structure. Following the 
notation convention of the previous reading, the generic continuously compounded 
instantaneous rate is denoted r, and will change (over time) according to the following 
relationship: 


dr = λ(t)dt + σ(t)dw 


It is useful to note how this model augments Model 1 and the Ho-Lee model. The 
functional form of Model 1 (with no drift), dr = σdw, now includes time-dependent drift 
and time-dependent volatility. The Ho-Lee model, dr = λ(t)dt + σdw, now includes 
non-constant volatility. As in the earlier models, dw is normally distributed with mean 0 
and standard deviation √dt. 


LO 14.b: Calculate the short-term rate change and determine the behavior of the 
standard deviation of the rate change using a model with time dependent 
volatility. 


The relationships between volatility in each period could take on an almost limitless 
number of combinations. For example, the volatility of the short-term rate in one year, 
σ(1), could be 220 basis points and the volatility of the short-term rate in two years, 
σ(2), could be 260 basis points. It is also entirely possible that σ(1) could be 220 basis 
points and σ(2) could be 160 basis points. To make the analysis more tractable, it is 
useful to assign a specific parameterization of time-dependent volatility. Consider the 
following model, which is known as Model 3: 


dr = λ(t)dt + σe^(-αt)dw 

where: 

σ = volatility at t = 0, which decreases exponentially to 0 for α > 0 

To illustrate the rate change using Model 3, assume a current short-term rate, r_0, of 5%, 
a drift, λ, of 0.24%, and, instead of constant volatility, include time-dependent volatility 
of σe^(-0.3t) (with initial σ = 1.50%). If we also assume the dw realization drawn from a 
normal distribution is 0.2 (with mean = 0 and standard deviation = √(1/12) = 0.2887), the 
change in the short-term rate after one month is calculated as: 


dr = 0.24% × (1/12) + 1.5% × e^(-0.3 × (1/12)) × 0.2 
dr = 0.02% + 0.29% = 0.31% 


Therefore, the current short-term rate of 5% plus the rate change (0.31%) gives a new 
rate of 5.31%. Note that this value is slightly less than the value assuming constant 
volatility (5.32%). This difference is expected given the exponential decay in the 
volatility. 
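
The same calculation can be written as a short function. The sketch below evaluates dr = λdt + σe^(-αt)dw with the volatility taken at t = 1/12, as in the example, and compares it with the constant-volatility result; the names are illustrative.

```python
import math

def model3_rate_change(lam, sigma0, alpha, t, dt, dw):
    """dr = lambda*dt + sigma0*exp(-alpha*t)*dw (exponentially decaying volatility)."""
    return lam * dt + sigma0 * math.exp(-alpha * t) * dw

r0, lam, sigma0, alpha, dt, dw = 0.05, 0.0024, 0.015, 0.3, 1 / 12, 0.2

dr_model3 = model3_rate_change(lam, sigma0, alpha, dt, dt, dw)
dr_constant_vol = lam * dt + sigma0 * dw      # Model 2 with the same inputs

print(f"Model 3:           dr = {dr_model3:.2%}, new rate = {r0 + dr_model3:.2%}")
print(f"Constant volatility: dr = {dr_constant_vol:.2%}, new rate = {r0 + dr_constant_vol:.2%}")
```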


Model 3 Effectiveness 
LO 14.c: Assess the efficacy of time-dependent volatility models. 


Time-dependent volatility models add flexibility to models of future short-term rates. 
This is particularly useful for pricing multi-period derivatives like interest rate caps 
and floors. Each cap and floor is made up of single period caplets and floorlets 
(essentially interest rate calls and puts). The payoff to each caplet or floorlet is based 
on the strike rate and the current short-term rate over the next period. Hence, the 
pricing of the cap and floor will depend critically on the forecast of o(t) at several 
future dates. 


It is impossible to describe the general behavior of the standard deviation over the 
relevant horizon because it will depend on the deterministic model chosen. However, 
there are some parallels between Model 3 and the mean-reverting drift (Vasicek) 
model. Specifically, if the initial volatility for both models is equal and the decay rate is 
the same as the mean reversion rate, then the standard deviations of the terminal 
distributions are exactly the same. Similarly, if the time-dependent drift in Model 3 is 
equal to the average interest rate path in the Vasicek model, then the two terminal 
distributions are identical, an even stronger observation than having the same terminal 
standard deviation. 
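
The equality of terminal standard deviations can be illustrated numerically. The sketch below sums the variances of the independent Model 3 shocks over many small steps and compares the result with the closed-form terminal standard deviation of the Vasicek short rate, σ√[(1 - e^(-2kT)) / (2k)]. The parameter values (σ = 1.5%, decay and mean reversion rates of 0.03, a 10-year horizon) are hypothetical choices for illustration.

```python
import math

def model3_terminal_sd(sigma0, alpha, T, steps=10_000):
    """Terminal sd under Model 3, summing shock variances with a midpoint rule."""
    dt = T / steps
    variance = sum(
        (sigma0 * math.exp(-alpha * (i + 0.5) * dt)) ** 2 * dt
        for i in range(steps)
    )
    return math.sqrt(variance)

def vasicek_terminal_sd(sigma, k, T):
    """Closed-form terminal sd of the short rate in the Vasicek model."""
    return sigma * math.sqrt((1 - math.exp(-2 * k * T)) / (2 * k))

sd_model3 = model3_terminal_sd(0.015, 0.03, 10)
sd_vasicek = vasicek_terminal_sd(0.015, 0.03, 10)
sd_model1 = 0.015 * math.sqrt(10)     # constant volatility, for comparison

print(f"Model 3 terminal sd: {sd_model3:.4%}")
print(f"Vasicek terminal sd: {sd_vasicek:.4%}")
print(f"Model 1 terminal sd: {sd_model1:.4%}")
```

With matching initial volatility and matching decay and mean reversion rates, the first two figures agree, and both are below the constant-volatility value.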


There are still important differences between these models. First, Model 3 will 
experience a parallel shift in the yield curve from a change in the short-term rate. 
Second, the purpose of the model drives the choice of the model. If the model is needed 
to price options on fixed-income instruments, then time-dependent volatility models are 
preferred to interpolate between observed market prices. On the other hand, if the 
model is needed to value or hedge fixed-income securities or options, then there is a 
rationale for choosing mean reversion models. 


One criticism of time-dependent volatility models is that they require the market to 
forecast short-term volatility far out into the future, which is not likely. A compromise 
is to forecast volatility approaching a constant value (in Model 3, the volatility 
approaches 0). A 
point in favor of the mean reversion models is the downward-sloping volatility term 
structure. 


MODULE QUIZ 14.1 
1. Regarding the validity of time-dependent drift models, which of the following 
statements is(are) correct? 

I. Time-dependent drift models are flexible since volatility from period to period can 
change. However, volatility must be an increasing function of short-term rate 
volatilities. 

II. Time-dependent volatility functions are useful for pricing interest rate caps and 
floors. 
A. I only. 
B. II only. 
C. Both I and II. 
D. Neither I nor II. 


MODULE 14.2: COX-INGERSOLL-ROSS (CIR) AND 
LOGNORMAL MODELS 




LO 14.d: Describe the short-term rate process under the Cox-Ingersoll-Ross 
(CIR) and lognormal models. 


LO 14.e: Calculate the short-term rate change and describe the basis point 
volatility using the CIR and lognormal models. 


Another issue with the aforementioned models is that the basis point volatility of the 
short-term rate is determined independently of the level of the short-term rate. This is 


